Journal articles

Combining Sequence and Itemset Mining to Discover Named Entities in Biomedical Texts: A New Type of Pattern

Marc Plantevit 1 Thierry Charnois 1 Jiri Kléma 2 Christophe Rigotti 3 Bruno Crémilleux 1
1 Equipe CODAG - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
3 DM2L - Data Mining and Machine Learning
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract: Biomedical named entity recognition (NER) is a challenging problem. In this paper, we show that mining techniques such as sequential pattern mining and sequential rule mining can help tackle this problem, but that each has limitations. We demonstrate and analyse these limitations and introduce a new kind of pattern, the LSR pattern, which offers an excellent trade-off between the high precision of sequential rules and the high recall of sequential patterns. We first formalise the LSR pattern mining problem, then show how LSR patterns enable us to tackle biomedical NER problems successfully. We report experiments carried out on real datasets that underline the relevance of our proposal.

Cited literature: 35 references

https://hal.archives-ouvertes.fr/hal-01011378
Contributor: Greyc Référent
Submitted on: Monday, June 23, 2014 - 4:48:10 PM
Last modification on: Friday, February 14, 2020 - 2:04:05 PM
Document(s) archived on: Tuesday, September 23, 2014 - 12:05:12 PM

File

RIACL-PLANTEVIT-2009-3.pdf
Publisher files allowed on an open archive

Citation

Marc Plantevit, Thierry Charnois, Jiri Kléma, Christophe Rigotti, Bruno Crémilleux. Combining Sequence and Itemset Mining to Discover Named Entities in Biomedical Texts: A New Type of Pattern. International Journal of Data Mining, Modelling and Management, Inderscience, 2009, 1 (2), pp.119-148. ⟨10.1504/IJDMMM.2009.026073⟩. ⟨hal-01011378⟩

Metrics

Record views: 535
File downloads: 327