Conference paper, Year: 2008

A corpus of real-life questions for evaluating robustness of QA systems

Abstract

Many evaluation campaigns on question answering (QA) systems have been organized over the years. The international TREC conference includes a QA track, the European CLEF campaign offers cross-language evaluation in eight languages, NTCIR includes a QA track in three languages, and the French Technolangue EQUER evaluation focused on 500 corpus-based questions. However, the questions asked in those campaigns are checked to be well formed and sufficiently complete. We aim to test QA systems on their ability to answer questions typed spontaneously by people who do not think carefully about the grammatical and lexical forms they use. The goal is to test QA system robustness under more realistic conditions of use. Related experiments in the document retrieval field have tested the robustness of search engines with automatic transcriptions of spoken queries (Crestani, 2000) or with automatically degraded text entries (Ruch, 2002). Moreover, we aim to test our QA system (Gillard et al., 2006) with questions written by dyslexic adults or children, or by non-native speakers. Intuitively, these populations are the most affected by such robustness problems.

1 Corpus constitution

A web-based approach was used to acquire data for the corpus. The motivation for this approach is twofold. First, it lets users take part in the experiment in relaxed conditions, when and where they wish. Second, it allows data to be collected from a wide population, especially from dyslexic individuals, who are already solicited for psycholinguistic experiments. It removes the geographical constraints that often limit the quantity of data in this area. The experiment is composed of 20 questions selected from the French EQUER evaluation campaign. The selected questions were among those correctly answered by SQuALIA (Gillard et al., 2006). Eight of them contain proper nouns, and two of them contain low-frequency foreign proper nouns.
The question focuses covered are: person name (5 questions), number (5 questions), date (3 questions), location (2 questions), and one question each for money, distance, age, journal name and military rank.
No file deposited

Dates and versions

hal-01321115, version 1 (25-05-2016)

Identifiers

  • HAL Id: hal-01321115, version 1

Cite

Laurianne Sitbon, Patrice Bellot, Philippe Blache. A corpus of real-life questions for evaluating robustness of QA systems. LREC, May 2008, Marrakech, Morocco. ⟨hal-01321115⟩