Structure patterns in Information Extraction: a multilingual solution?

Gaël Lejeune 1, 2
1 Equipe Hultech - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image et Instrumentation de Caen
Abstract : Information Extraction (IE) systems now work very well, but they are mostly monolingual and difficult to port to other languages. We may therefore need to stop thinking only in terms of traditional pattern-based approaches. Our project, PULS, performs epidemic surveillance through the analysis of online news, in collaboration with MedISys, developed at the European Commission's Joint Research Centre (EC-JRC). PULS had only an English pattern-based system, and we carried out a pilot study on French to prepare a multilingual extension. We present here why we chose to set aside classical approaches and how we can instead use a mainly language-independent approach based only on discourse properties of press article structure. Our results show a precision of 87% and a recall of 93%, and we have good reasons to think that this approach will also be effective for other languages.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-00605691
Contributor : Gaël Lejeune
Submitted on : Monday, July 4, 2011 - 4:40:36 AM
Last modification on : Tuesday, October 19, 2021 - 11:34:56 PM
Long-term archiving on : Monday, November 12, 2012 - 9:56:25 AM

File

amict09.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-00605691, version 1

Citation

Gaël Lejeune. Structure patterns in Information Extraction: a multilingual solution? Advances in Methods of Information and Communication Technology, May 2009, Petrozavodsk, Russia. pp.105-111. ⟨hal-00605691⟩
