Journal articles

Construction d'un corpus parallèle à partir de corpus comparables pour la simplification de textes médicaux en français

Abstract : The purpose of automatic simplification is to create a version of a text that is easier to understand for a given target population. We aim to simplify medical texts. Usually, the lexicon and rules required for simplification are acquired from parallel corpora. Since such corpora are not available for French, we propose a method for creating them from comparable corpora. Our method relies on a filtering step, whose purpose is to keep the best sentence candidates for alignment, and an alignment step treated as a categorization problem: the aim is to decide whether a pair of sentences is alignable or not. We exploit different types of features (mainly derived from lexicons and corpora) and obtain an F-measure of up to 0.97 on balanced data.
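
As an illustration of the two steps described in the abstract, the sketch below filters candidate sentence pairs with a simple lexical-overlap score and evaluates binary alignable/not-alignable decisions with the F-measure. It is a minimal sketch, not the authors' implementation: the Jaccard overlap feature, the 0.2 threshold, and the function names (`tokens`, `jaccard`, `filter_candidates`, `f_measure`) are assumptions chosen for the example, and the paper's actual features and classifier are not reproduced here.

```python
def tokens(sentence):
    """Lowercased, whitespace-split token set (a deliberately crude tokenizer)."""
    return set(sentence.lower().split())


def jaccard(a, b):
    """Lexical overlap between two sentences: |A & B| / |A | B|."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def filter_candidates(pairs, threshold=0.2):
    """Filtering step (hypothetical criterion): keep pairs whose overlap
    reaches the threshold, so only plausible candidates go on to alignment."""
    return [(a, b) for a, b in pairs if jaccard(a, b) >= threshold]


def f_measure(gold, predicted):
    """F-measure (harmonic mean of precision and recall) for boolean labels,
    where True means 'alignable'."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In a fuller setup, the overlap score would be one feature among several passed to a trained classifier rather than being thresholded directly.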
Document type : Journal articles
https://hal.archives-ouvertes.fr/hal-03094983
Contributor : Natalia Grabar
Submitted on : Wednesday, June 23, 2021 - 11:00:47 AM
Last modification on : Tuesday, June 29, 2021 - 2:38:09 PM

File

tal61-2_cardon_ok.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-03094983, version 1

Citation

Rémi Cardon, Natalia Grabar. Construction d'un corpus parallèle à partir de corpus comparables pour la simplification de textes médicaux en français. Revue TAL, ATALA (Association pour le Traitement Automatique des Langues), 2020. ⟨hal-03094983⟩

Metrics

Record views : 12
File downloads : 18