Extracting People's Hobby and Interest Information from Social Media Content
Abstract
In this study, we investigate how to analyze people's social media profiles to extract information about their hobbies and interests. We developed a baseline system that applies heuristic rules and the TF-IDF term-weighting method to determine the terms most representative of hobbies and interests. A pilot test collected user feedback on the perceived usefulness of the extracted tags. The baseline system was then extended with new functionality: limiting the scope of relevant content, extracting named entities, using predefined dictionaries to identify even low-scoring hobbies and interests, and using machine translation to handle content in multiple languages.
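As a rough illustration of the TF-IDF weighting mentioned in the abstract, the sketch below ranks the terms of each profile by term frequency times inverse document frequency. It is not the authors' system: the tokenizer, the function names `tokenize` and `top_terms`, the sample profiles, and the specific TF-IDF variant (relative term frequency with an unsmoothed natural-log IDF) are all assumptions made for this example, and the heuristic rules from the paper are not modeled.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(profiles, k=3):
    """Return the k highest-scoring TF-IDF terms for each profile.

    profiles: list of strings, one per user (hypothetical input format).
    Ties are broken alphabetically so the ranking is deterministic.
    """
    docs = [Counter(tokenize(p)) for p in profiles]
    n_docs = len(docs)

    # Document frequency: in how many profiles does each term appear?
    df = Counter()
    for doc in docs:
        df.update(doc.keys())

    ranked_per_profile = []
    for doc in docs:
        total = sum(doc.values())
        # Relative term frequency times unsmoothed inverse document frequency.
        scores = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in doc.items()
        }
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        ranked_per_profile.append(ranked[:k])
    return ranked_per_profile


if __name__ == "__main__":
    sample = [
        "I love hiking and hiking trails",
        "I love cooking and baking bread",
        "I love chess and chess puzzles",
    ]
    for terms in top_terms(sample, k=2):
        print(terms)
```

With this weighting, words shared by every profile (such as "love" in the sample) receive a score of zero, which is why TF-IDF tends to surface terms that distinguish one person's interests from another's.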
Domains
Computation and Language [cs.CL]
Origin: Files produced by the author(s)