Conference papers

Tweetaneuse : Fouille de motifs en caractères et plongement lexical à l’assaut du deft 2017

Davide Buscaldi 1 Aude Grezka 1 Gaël Lejeune 2 
LINA - Laboratoire d'Informatique de Nantes Atlantique
Abstract : This article describes the methods developed by the TWEETANEUSE team for the 2017 edition of the French text mining challenge (DEFT 2017). This year, the challenge was dedicated to tweet classification: polarity detection and figurative language detection. The first method relies on character-level patterns used as features for training a one-vs-rest classifier. These patterns are known as "frequent closed patterns without gap" in the data mining community and as maximal repeated strings in the text algorithmics community. The two other methods use 13 features computed with lexical resources (FEEL, LabMT and a resource of our own). One of them adds a bag-of-words representation of the tweets, while the other adds a word-embedding representation. The character-level method produced the best results, in particular for the second task, figurative tweet detection.
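To make the notion of "frequent closed patterns without gap" concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names (`closed_repeated_substrings`, `featurize`), the parameters `min_count` and `max_len`, and the naive enumeration of substrings are all assumptions made for illustration. It keeps every substring that occurs at least `min_count` times and whose count drops as soon as it is extended by one character on either side, then turns a text into a binary feature vector over those patterns.

```python
from collections import Counter


def closed_repeated_substrings(texts, min_count=2, max_len=8):
    """Naive sketch (hypothetical helper): return {pattern: count} for
    contiguous substrings that are frequent and closed in `texts`."""
    counts = Counter()
    for text in texts:
        for i in range(len(text)):
            # Count substrings up to max_len + 1 characters, so that the
            # closedness of the longest kept patterns can still be checked.
            for j in range(i + 1, min(len(text), i + max_len + 1) + 1):
                counts[text[i:j]] += 1

    # A substring is not closed if a one-character extension (left or
    # right) occurs exactly as often as the substring itself.
    not_closed = set()
    for sub, count in counts.items():
        if len(sub) > 1:
            for shorter in (sub[1:], sub[:-1]):
                if counts[shorter] == count:
                    not_closed.add(shorter)

    return {
        sub: count
        for sub, count in counts.items()
        if count >= min_count and len(sub) <= max_len and sub not in not_closed
    }


def featurize(text, patterns):
    """Binary presence vector of `text` over an ordered list of patterns."""
    return [1 if pattern in text else 0 for pattern in patterns]


patterns = closed_repeated_substrings(["abab"])
print(patterns)  # {'ab': 2}
print(featurize("xxab", ["ab", "zz"]))  # [1, 0]
```

In `"abab"`, the substrings `a` and `b` each occur twice, but so does `ab`, which contains them; only `ab` is therefore closed. Vectors built this way could then be passed to any one-vs-rest classifier, for instance scikit-learn's `OneVsRestClassifier`; the abstract does not say which implementation the team used.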
Contributor : Aude Grezka
Submitted on : Wednesday, February 2, 2022 - 11:45:11 AM
Last modification on : Wednesday, April 27, 2022 - 3:54:47 AM
Long-term archiving on : Tuesday, May 3, 2022 - 6:52:49 PM


Publication funded by an institution


  • HAL Id : hal-02362125, version 1


Davide Buscaldi, Aude Grezka, Gaël Lejeune. Tweetaneuse : Fouille de motifs en caractères et plongement lexical à l’assaut du deft 2017. 24e Conférence sur le Traitement Automatique des Langues Naturelles (TALN) : Analyse d'opinion et langage figuratif dans des tweets, Jun 2017, Orléans, France. pp. 65-76. ⟨hal-02362125⟩


