Conference paper, Year: 2017

Wikipedia-based extraction of key information from resumes

Abstract

There is a vast amount of information about individuals available on the Web that has potential uses in Human Resource Management (HRM), both for recruiters and for job seekers. Since people's names are inherently ambiguous, finding information related to a specific person is challenging: a simple query by name will likely return web pages related to several different individuals who happen to share the name of the target of the query. In the context of an HRM application, one way to narrow down the search to the correct individual is to complete the query with key information that can be extracted from the resume of that individual. Examples of key information are the schools attended by the individual, her degrees, the companies she worked for and the people she collaborated with. In order to automate this process, we need tools that are able to annotate a resume so that the key information can be easily extracted. The numerous existing tools for the annotation of generic text documents are unfit because resumes generally do not strictly follow natural language constructs, such as paragraphs and sentences, and often mention entities in languages (e.g., “Politecnico di Milano”, in Italian) that differ from the language in which the resumes are written (e.g., English). In this paper we propose an approach that uses Wikipedia, the largest multilingual encyclopedia to date, to automatically annotate key information in resumes. The major strengths of the approach are that it is language-independent and unsupervised, so it requires no manually labeled training data. Our evaluation shows that the approach achieves high precision and recall and outperforms TagMe and Babelfy, two prominent existing annotators.
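The abstract does not describe the annotation procedure itself, so the following is only a minimal sketch of the general idea of completing a name query with key information found in a resume. It is not the authors' method: the tiny `GAZETTEER` dictionary is a hypothetical stand-in for Wikipedia titles, and the names `normalize`, `annotate`, `build_query` and the person "Jane Doe" are invented for illustration. Matching works on token windows within each line rather than on sentences, since resumes rarely contain full sentences.

```python
import re
import unicodedata

# Hypothetical mini-gazetteer standing in for Wikipedia titles and entity types.
GAZETTEER = {
    "politecnico di milano": "school",
    "university of brighton": "school",
    "ibm": "company",
}


def normalize(text):
    """Lowercase, strip accents, and collapse punctuation and whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]+", " ", no_accents.lower()).strip()


def annotate(resume_lines, gazetteer=GAZETTEER, max_len=4):
    """Return (surface form, type) pairs, preferring the longest match at each token."""
    annotations = []
    for line in resume_lines:
        tokens = line.split()
        i = 0
        while i < len(tokens):
            # Try the longest window first, then shrink it one token at a time.
            for n in range(min(max_len, len(tokens) - i), 0, -1):
                surface = " ".join(tokens[i:i + n])
                key = normalize(surface)
                if key in gazetteer:
                    annotations.append((surface.strip(",.;:()"), gazetteer[key]))
                    i += n
                    break
            else:
                i += 1  # no window starting at this token matched
    return annotations


def build_query(name, annotations):
    """Complete a name query with the annotated key information as quoted phrases."""
    terms = ['"%s"' % name] + ['"%s"' % surface for surface, _ in annotations]
    return " ".join(terms)


resume = [
    "2010-2012  MSc Computer Science, Politecnico di Milano",
    "Software engineer at IBM (2013-2016)",
]
found = annotate(resume)
query = build_query("Jane Doe", found)
```

A real system would replace the dictionary lookup with a lookup against Wikipedia, which is where the language independence mentioned in the abstract would come from; the sketch only shows how extracted entities can narrow an otherwise ambiguous name query.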
No file deposited

Dates and versions

hal-01764238, version 1 (11-04-2018)

Identifiers

Cite

Mohammad Ghufran, Gianluca Quercini, Nacéra Bennacer Seghouani. Wikipedia-based extraction of key information from resumes. 2017 11th International Conference on Research Challenges in Information Science (RCIS), May 2017, Brighton, United Kingdom. ⟨10.1109/RCIS.2017.7956530⟩. ⟨hal-01764238⟩