CAS: French Corpus with Clinical Cases

Abstract : Textual corpora are essential for many NLP applications because they provide the information needed to create, tune, and test these applications and their associated tools. They are also crucial for designing reliable methods and obtaining reproducible results. Yet in some domains, such as medicine, confidentiality or ethical constraints make it difficult or even impossible to access textual data representative of what those domains produce. We propose the CAS corpus, built from clinical cases as they are reported in the published scientific literature in French. We describe this corpus, which currently contains over 397,000 word occurrences, and its existing linguistic and semantic annotations.
Document type :
Conference papers
Contributor : Clément Dalloux
Submitted on : Tuesday, November 27, 2018 - 8:32:46 PM
Last modification on : Tuesday, December 4, 2018 - 1:10:35 AM


  • HAL Id : hal-01937096, version 1


Natalia Grabar, Vincent Claveau, Clément Dalloux. CAS: French Corpus with Clinical Cases. LOUHI 2018 - The Ninth International Workshop on Health Text Mining and Information Analysis, Oct 2018, Brussels, Belgium. pp.1-7, 2018, Ninth International Workshop on Health Text Mining and Information Analysis (LOUHI) Proceedings of the Workshop. 〈hal-01937096〉


