A Corpus for Hybrid Question Answering Systems
Résumé
Question answering has been the focus of a lot of researches and evaluation campaigns, either for text-based systems (TREC and CLEF evaluation campaigns for example), or for knowledge-based systems (QALD, BioASQ). Few systems have effectively combined both types of resources and methods in order to exploit the fruitful- ness of merging the two kinds of information repositories. The only evaluation QA track that focuses on hybrid QA is QALD since 2014. As it is a recent task, few annotated data are available (around 150 questions). In this paper, we present a question answering dataset that was constructed to develop and evaluate hybrid question an- swering systems. In order to create this corpus, we collected several textual corpora and augmented them with entities and relations of a knowledge base by retrieving paths in the knowledge base which allow to answer the questions. The resulting corpus contains 4300 question-answer pairs and 1600 have a true link with DBpedia.
Origine : Fichiers éditeurs autorisés sur une archive ouverte
Loading...