Population of a Knowledge Base for News Metadata from Unstructured Text and Web Data

Rosa Stern 1, 2, * Benoît Sagot 1
* Corresponding author
1 ALPAGE - Analyse Linguistique Profonde à Grande Echelle ; Large-scale deep linguistic processing
Inria Paris-Rocquencourt, UPD7 - Université Paris Diderot - Paris 7
Abstract : We present a practical use case of knowledge base (KB) population at the French news agency AFP. The target KB instances are entities relevant for news production and content enrichment. In order to acquire uniquely identified entities over news wires, i.e. textual data, and integrate the resulting KB in the Linked Data framework, a series of data models need to be aligned: Web data resources are harvested for creating a wide coverage entity database, which is in turn used to link entities to their mentions in French news wires. Finally, the extracted entities are selected for instantiation in the target KB. We describe our methodology along with the resources created and used for the target KB population.
Document type :
Conference papers
https://hal.archives-ouvertes.fr/hal-00699297
Contributor : Rosa Stern
Submitted on : Sunday, May 20, 2012 - 3:34:44 PM
Last modification on : Thursday, August 29, 2019 - 2:24:06 PM
Long-term archiving on : Friday, November 30, 2012 - 11:56:37 AM

File

naacl12akbc.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00699297, version 1

Citation

Rosa Stern, Benoît Sagot. Population of a Knowledge Base for News Metadata from Unstructured Text and Web Data. AKBC-WEKEX 2012 - The Knowledge Extraction Workshop at NAACL-HLT 2012, Jun 2012, Montréal, Canada. ⟨hal-00699297⟩
