Understanding Social Media Texts with Minimum Human Effort on #Twitter
Abstract
Named Entity Recognition (NER) is a long-standing Natural Language Processing (NLP) task. However, traditional machine learning methods run into new problems when applied to social media data such as Twitter messages, and their performance often degrades in this setting.
Twitter messages have particular features. Consider the example "Today wasz Fun cusz anna Came juss for me <3: hahaha".
This example presents several difficulties: 1) spelling mistakes: wasz (was), cusz (because), juss (just); 2) uppercase/lowercase inversion: Fun (fun), anna (Anna), Came (came); 3) emoticon: <3; 4) interjection: hahaha.
Inconsistent capitalization is a major problem for the NER task because the only personal name in our tweet, "anna", begins with a lowercase letter rather than the uppercase letter it would have in grammatically well-formed text.
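To make the capitalization problem concrete, here is a minimal sketch of a naive capitalization cue applied to the example tweet. The function `capitalized_candidates` is a hypothetical illustration, not the method proposed in this paper: it simply treats every capitalized token that does not open the message as a proper-noun candidate.

```python
def capitalized_candidates(text):
    """Hypothetical heuristic: return capitalized tokens that do not start the message."""
    tokens = text.split()  # whitespace tokenization, enough for this illustration
    return [tok for i, tok in enumerate(tokens) if i > 0 and tok[0].isupper()]


tweet = "Today wasz Fun cusz anna Came juss for me <3: hahaha"
candidates = capitalized_candidates(tweet)

print(candidates)            # ['Fun', 'Came']
print("anna" in candidates)  # False
```

On this tweet the cue produces two false candidates ("Fun", "Came") and misses the only real personal name ("anna"), which is the kind of degradation described above.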
In this paper, we present our work on recognizing named entities on Twitter.
Origin: Files produced by the author(s)