Learning for Text Summarization using Labeled and Unlabeled Sentences.

Massih-Reza Amini, Patrick Gallinari 1
1 APA - Apprentissage et Acquisition des connaissances
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract: We describe an original machine learning approach to automatic text summarization that works by extracting the most relevant sentences from a document. Since labeled corpora are difficult to collect for this task, we propose a semi-supervised method that uses a small set of labeled sentences together with a large set of unlabeled documents to improve the performance of summarization systems. We show that this method is an instance of the Classification EM algorithm in the case of Gaussian densities, and that it can also be used in a non-parametric setting. Finally, we provide an empirical evaluation on the Reuters newswire corpus.
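To make the Classification EM (CEM) idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes each sentence is reduced to a single hypothetical relevance score, that the two classes (0 = not in summary, 1 = in summary) follow Gaussian densities with a shared variance, and that the labeled set contains at least one sentence of each class. The function names and the toy scores are illustrative only.

```python
import math


def fit_gaussians(scores, labels):
    """M-step: estimate class priors, class means, and one shared variance."""
    params = {}
    n = len(scores)
    pooled_sq = 0.0
    for c in (0, 1):
        members = [s for s, y in zip(scores, labels) if y == c]
        mean = sum(members) / len(members)
        pooled_sq += sum((s - mean) ** 2 for s in members)
        params[c] = (len(members) / n, mean)
    var = max(pooled_sq / n, 1e-6)  # floor keeps the variance strictly positive
    return params, var


def log_joint(score, c, params, var):
    """log p(c) + log N(score | mean_c, var), dropping the constant shared by both classes."""
    prior, mean = params[c]
    return math.log(prior) - (score - mean) ** 2 / (2.0 * var)


def classify(score, params, var):
    """C-step rule: assign the class with the larger joint log-density."""
    return int(log_joint(score, 1, params, var) > log_joint(score, 0, params, var))


def cem(lab_scores, lab_labels, unl_scores, max_iter=50):
    """Classification EM: hard-label the unlabeled sentences, refit, repeat."""
    params, var = fit_gaussians(lab_scores, lab_labels)  # initialise from labeled data only
    unl_labels = None
    for _ in range(max_iter):
        new_labels = [classify(s, params, var) for s in unl_scores]  # E and C steps
        if new_labels == unl_labels:  # partition unchanged: stop
            break
        unl_labels = new_labels
        # M-step on labeled sentences (labels fixed) plus hard-labeled unlabeled ones
        params, var = fit_gaussians(lab_scores + unl_scores, lab_labels + unl_labels)
    return params, var, unl_labels


# Toy data (made up for illustration): higher score = more summary-worthy.
lab_scores = [0.1, 0.2, 0.8, 0.9]
lab_labels = [0, 0, 1, 1]
unl_scores = [0.05, 0.15, 0.25, 0.75, 0.85, 0.95]

params, var, labels = cem(lab_scores, lab_labels, unl_scores)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The key difference from standard EM is the classification step: each unlabeled sentence receives a hard class assignment rather than a soft posterior weight, and the loop stops once the partition no longer changes.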
Document type: Conference paper

Contributor: Lip6 Publications
Submitted on: Thursday, August 3, 2017 - 5:22:53 PM
Last modification on: Thursday, March 21, 2019 - 1:09:52 PM

Massih-Reza Amini, Patrick Gallinari. Learning for Text Summarization using Labeled and Unlabeled Sentences. ICANN 2001 - 11th International Conference on Artificial Neural Networks, Aug 2001, Vienna, Austria. pp.1177-1184, ⟨10.1007/3-540-44668-0_164⟩. ⟨hal-01571860⟩


