Information Digestion

Gaël Dias

Habilitation à diriger des recherches, Year: 2010

Gaël Dias

Function : Author
PersonId : 3735
IdHAL : gael-dias
ORCID : 0000-0002-5840-1603
IdRef : 113779747

hultech

Abstract

The World Wide Web (WWW) is a huge information network in which searching for relevant, high-quality content remains an open problem. The ambiguity of natural language is traditionally one of the main reasons that search engines fail to retrieve information matching users' needs. However, globalized access to the WWW via weblogs and social networks has highlighted new problems: Web documents tend to be subjective, they mainly refer to current events to the detriment of past events, and their ever-growing number contributes to the well-known problem of information overload. In this thesis, we present our contributions to digesting information in real-world heterogeneous text environments (i.e. the Web), thus reducing the effort users must spend to find relevant, high-quality information. However, most work on Information Digestion deals with the English language, fostered by freely available linguistic tools and resources, and as such cannot be directly replicated for other languages. To overcome this drawback, two directions may be followed: building resources and tools for a given language, or proposing language-independent approaches. In this report, we focus on language-independent unsupervised methodologies to (1) extract implicit knowledge about the language and (2) understand the explicit information conveyed by real-world texts, thereby enabling Multilingual Information Digestion.

Keywords

Unsupervised language-independent approaches; Information Digestion; Real-world text environments; Word semantic relations; Explicit and implicit knowledge extraction; Sentiment analysis

Domains

Machine Learning [cs.LG]; Artificial Intelligence [cs.AI]

Main file

Thesis-HDR.pdf (6.15 MB)

Contributor: Denys Duchier

https://theses.hal.science/tel-00669780

Submitted on: Monday, February 13, 2012, 7:31:39 PM

Last modification on: Saturday, June 25, 2022, 10:11:49 AM

Long-term archiving on: Monday, May 14, 2012, 2:55:16 AM

Dates and versions

tel-00669780, version 1 (13-02-2012)

Identifiers

HAL Id: tel-00669780, version 1

Cite

Gaël Dias. Information Digestion. Machine Learning [cs.LG]. Université d'Orléans, 2010. ⟨tel-00669780⟩

Collections

UNIV-ORLEANS MSL MSL-THESE
