%0 Conference Paper
%F Oral
%T ECSTRA-INSERM @ CLEF eHealth2016-task 2: ICD10 Code Extraction from Death Certificates
%+ Equipe de Recherche en Ingénierie des Connaissances (ERIC)
%+ Université Pierre et Marie Curie - Paris 6 (UPMC)
%+ ORS PACA
%+ Biostatistique et épidemiologie clinique
%+ Entrepôts, Représentation et Ingénierie des Connaissances (ERIC)
%+ CHU Saint-Antoine [AP-HP]
%A Dermouche, Mohamed
%A Looten, Vincent
%A Flicoteaux, Rémi
%A Chevret, Sylvie
%A Velcin, Julien
%A Taright, Namik
%< peer-reviewed
%B Conference and Labs of the Evaluation Forum
%C Evora, Portugal
%8 2016-09-05
%D 2016
%K ICD10 code assignment
%K cause of death extraction
%K topic models
%K machine learning
%K natural language processing
%K text mining
%Z Computer Science [cs]/Document and Text Processing
%Z Computer Science [cs]/Information Retrieval [cs.IR]
%Z Computer Science [cs]/Artificial Intelligence [cs.AI]
%Z Statistics [stat]/Machine Learning [stat.ML]
%X This paper describes the participation of the ECSTRA-INSERM team in CLEF eHealth 2016, task 2.C. The task involves extracting ICD10 codes from death certificates, which are mainly described in short plain text. We cast the task as a machine learning problem: predicting the ICD10 codes (a categorical variable) from the raw text transformed into a bag-of-words matrix. We rely on probabilistic topic models, which we evaluate against classical classifiers such as SVM and Naive Bayes. We demonstrate the effectiveness of topic models for this task in terms of prediction accuracy and interpretability of results.
%G English
%2 https://hal.science/hal-02052331/document
%2 https://hal.science/hal-02052331/file/16090061.pdf
%L hal-02052331
%U https://hal.science/hal-02052331
%~ UNIV-PARIS7
%~ UPMC
%~ UNIV-LYON1
%~ UNIV-LYON2
%~ APHP
%~ ERIC
%~ USPC
%~ LYON2
%~ SORBONNE-UNIVERSITE
%~ SU-INF-2018
%~ SU-MEDECINE
%~ SU-MED
%~ UDL
%~ UNIV-LYON
%~ UNIV-PARIS
%~ SU-TI
%~ ALLIANCE-SU