Semantic Annotation of the ACL Anthology Corpus for the Automatic Analysis of Scientific Literature

Abstract : This paper describes the process of creating a corpus annotated for concepts and semantic relations in the scientific domain. Part of the ACL Anthology Corpus was selected for annotation, but the annotation process itself is not specific to the computational linguistics domain and could be applied to any scientific corpus. Concepts were identified and annotated fully automatically, based on a combination of terminology extraction and available ontological resources. A typology of semantic relations between concepts is also proposed. This typology, consisting of 18 domain-specific and 3 generic relations, is the result of a corpus-based investigation of the text sequences occurring between concepts in sentences. A sample of 500 abstracts from the corpus is currently being manually annotated with these semantic relations. Only explicit relations are taken into account, so that the data can serve to train or evaluate pattern-based semantic relation classification systems.
Document type :
Conference papers

Cited literature : 24 references

https://hal.archives-ouvertes.fr/hal-01360407
Contributor : Kata Gabor
Submitted on : Tuesday, September 6, 2016 - 3:16:31 PM
Last modification on : Tuesday, January 5, 2021 - 5:28:07 PM
Long-term archiving on : Wednesday, December 7, 2016 - 12:32:47 PM

File

870_Paper.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-01360407, version 1

Citation

Kata Gábor, Haïfa Zargayouna, Davide Buscaldi, Isabelle Tellier, Thierry Charnois. Semantic Annotation of the ACL Anthology Corpus for the Automatic Analysis of Scientific Literature. LREC 2016, May 2016, Portoroz, Slovenia. ⟨hal-01360407⟩
