Conference papers

Word Embedding for French Natural Language in Healthcare: A Comparative Study.

Abstract: Structuring raw medical documents with ontology mapping is the next step for medical intelligence. Deep learning models take mathematically embedded information, such as encoded text, as input. To provide it, word embedding methods represent every word in a text as a fixed-length vector. A formal evaluation of three word embedding methods was performed on raw medical documents. The data comprise more than 12M diverse documents produced at the Rouen hospital (drug prescriptions, discharge and surgery summaries, inter-service letters, etc.). Automatic and manual validation shows that Word2Vec based on the skip-gram architecture achieved the best score on three of the four accuracy tests. This model will now be used as the first layer of an AI-based semantic annotator.
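As an illustration of the skip-gram idea named in the abstract, the following is a minimal sketch of how (center, context) training pairs can be drawn from a tokenized sentence. It is not the authors' pipeline: the function name `skipgram_pairs`, the `window` parameter, and the sample tokens are assumptions made for the example, and the training of the vectors themselves is left out.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for each token and its neighbours.

    Skip-gram learns a fixed-length vector per word by predicting the
    context words that appear within `window` positions of a center word.
    This helper only builds the training pairs; it does not train a model.
    """
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical tokenized sentence, not taken from the Rouen corpus.
tokens = ["patient", "admis", "pour", "douleur", "thoracique"]
pairs = skipgram_pairs(tokens, window=1)
```

With a window of 1, each word is paired only with its immediate neighbours; a larger window captures broader context at the cost of more training pairs.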

https://hal.archives-ouvertes.fr/hal-02409087
Contributor: Stéfan Darmoni
Submitted on: Friday, December 13, 2019 - 11:54:31 AM
Last modification on: Saturday, February 15, 2020 - 1:44:42 AM

Citation

Emeric Dynomant, Romain Lelong, Badisse Dahamna, Clément Massonnaud, Gaetan Kerdelhué, et al. Word Embedding for French Natural Language in Healthcare: A Comparative Study. MEDINFO 2019: Health and Wellbeing e-Networks for All, Aug 2019, Lyon, France. pp.118-122, ⟨10.3233/SHTI190195⟩. ⟨hal-02409087⟩
