Automatic annotation of bibliographical references in digital humanities books, articles and blogs

Abstract: In this paper, we address the problem of extracting and processing useful information from bibliographic references in Digital Humanities (DH) data. A machine learning technique for sequential data analysis, the Conditional Random Field (CRF), is applied to a corpus extracted from the OpenEdition site, a web platform for journals and book collections in the humanities and social sciences. We present our ongoing project toward this goal, which includes, as a preliminary step, the construction of a suitable corpus and an efficient CRF model trained on it. This project is supported by a Google Grant for Digital Humanities. A number of experiments are conducted to find one of the best settings for a CRF model on the corpus, and the results are verified through both automatic and manual evaluation.
Document type:
Conference paper
CCS'11 the ACM Conference on Computer and Communications Security, Oct 2011, Chicago, United States. 2011, 〈10.1145/2064058.2064068〉

https://hal.archives-ouvertes.fr/hal-01317638
Contributor: Bibliothèque Universitaire Déposants Hal-Avignon
Submitted on: Wednesday, 18 May 2016 - 16:33:51
Last modified on: Wednesday, 28 September 2016 - 15:53:52

Citation

Young-Min Kim, Patrice Bellot, Elodie Faath, Marin Dacos. Automatic annotation of bibliographical references in digital humanities books, articles and blogs. CCS'11 the ACM Conference on Computer and Communications Security, Oct 2011, Chicago, United States. 2011, 〈10.1145/2064058.2064068〉. 〈hal-01317638〉
