Evaluation of the Terminology Coverage in the French Corpus LiSSa.

Abstract : Extracting concepts from medical texts is key to supporting many advanced applications in medical information retrieval. Entity recognition in French texts is further challenged because many of the available resources were originally developed for English texts. This paper evaluates terminology coverage in a corpus of 50,000 French articles extracted from the bibliographic database LiSSa. The corpus was automatically indexed with 32 health terminologies, either published in French or translated, and the terminologies providing the best coverage of these documents were then determined. The results show that major resources such as the NCI and SNOMED CT thesauri achieve the broadest annotation of the corpus, while specific French resources prove to be valuable assets.
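The coverage comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the record does not define the coverage metric, so the sketch assumes coverage means the share of documents that receive at least one annotation from a given terminology. The function name, the (doc_id, terminology, concept_id) tuple format, and the toy data are all hypothetical.

```python
from collections import defaultdict


def coverage_by_terminology(annotations, n_documents):
    """Rank terminologies by the share of documents they annotate at least once.

    annotations: iterable of (doc_id, terminology, concept_id) tuples.
                 This format is an assumption made for illustration.
    n_documents: total corpus size, including documents with no annotation.
    """
    if n_documents <= 0:
        raise ValueError("n_documents must be positive")

    # Collect the distinct documents touched by each terminology.
    docs_per_terminology = defaultdict(set)
    for doc_id, terminology, _concept_id in annotations:
        docs_per_terminology[terminology].add(doc_id)

    # Assumed metric: fraction of the corpus with at least one annotation.
    coverage = {
        terminology: len(docs) / n_documents
        for terminology, docs in docs_per_terminology.items()
    }
    # Highest coverage first.
    return sorted(coverage.items(), key=lambda item: item[1], reverse=True)


# Toy annotations, invented for illustration; concept identifiers are placeholders.
toy_annotations = [
    (1, "SNOMED CT", "C001"),
    (2, "SNOMED CT", "C002"),
    (2, "NCI", "C100"),
    (3, "SNOMED CT", "C001"),
]
ranking = coverage_by_terminology(toy_annotations, n_documents=4)
print(ranking)
```

Other definitions of coverage, such as the number of distinct concepts matched or the number of annotations per document, would give different rankings; the document-level share is used here only because it is the simplest to state.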
Document type :
Journal articles

https://hal.archives-ouvertes.fr/hal-01843039
Contributor : Lina F Soualmia
Submitted on : Wednesday, July 18, 2018 - 2:11:17 PM
Last modification on : Wednesday, October 16, 2019 - 1:50:03 PM

Identifiers

  • HAL Id : hal-01843039, version 1
  • PUBMED : 28423768

Citation

Chloé Cabot, Lina F. Soualmia, Julien Grosjean, Nicolas Griffon, Stéfan Darmoni. Evaluation of the Terminology Coverage in the French Corpus LiSSa. Studies in Health Technology and Informatics, IOS Press, 2018, pp.126-130. ⟨hal-01843039⟩
