Stochastic Models for Surface Information Extraction in Texts

Abstract : We describe the application of numerical machine learning techniques to the extraction of information from a collection of textual data. More precisely, we consider the modeling of text sequences with hidden Markov models and multilayer perceptrons, and show how these models can be used to perform specific surface extraction tasks (i.e., tasks that do not require in-depth syntactic or semantic analysis). We consider different text representations using semantic and syntactic knowledge, and analyze the influence of different grammatical constraints on the models using the MUC-6 corpus.
Document type : Conference papers
Contributor : Lip6 Publications
Submitted on : Monday, August 14, 2017 - 5:48:47 PM
Last modification on : Thursday, March 21, 2019 - 1:06:22 PM



Massih-Reza Amini, Hugo Zaragoza, Patrick Gallinari. Stochastic Models for Surface Information Extraction in Texts. 9th International Conference of Artificial Neural Networks, Sep 1999, Edinburgh, United Kingdom. pp.892-897, ⟨10.1049/cp:19991225⟩. ⟨hal-01574502⟩


