Leveraging Time for Spammers Detection on Twitter

Mahdi Washha; Aziz Qaroush; Florence Sèdes

doi:10.1145/3012071.3012078

Conference paper, Year: 2016

Mahdi Washha

Role: Author

Systèmes d’Informations Généralisées

Aziz Qaroush

Role: Author

Birzeit University

Florence Sèdes

Role: Author
PersonId : 735498
IdHAL : florence-sedes
ORCID : 0000-0002-9273-302X
IdRef : 033232679

Systèmes d’Informations Généralisées

Abstract

Twitter is one of the most popular microblogging social systems, providing a set of distinctive posting services that operate in real time. The flexibility of these services has attracted unethical individuals, so-called "spammers", who aim to spread malicious, phishing, and misleading information. Unfortunately, the existence of spam results in non-negligible problems related to search and users' privacy. In the battle against spam, various detection methods have been designed that automate the detection process by combining the "features" concept with machine learning methods. However, the existing features are not effective enough to adapt to spammers' tactics, because the features are easy to manipulate. Also, graph features are not suitable for Twitter-based applications, despite the high performance obtainable when applying such features. In this paper, going beyond simple statistical features such as the number of hashtags and the number of URLs, we examine the time property by advancing the design of some features used in the literature and by proposing new time-based features. The new feature design is divided between robust advanced statistical features that explicitly incorporate the time attribute and behavioral features that identify any posting behavior pattern. The experimental results show that the new form of features correctly classifies the majority of spammers, with an accuracy higher than 93% when using the Random Forest learning algorithm on a collected and annotated data set. The results outperform the accuracy of state-of-the-art features by about 6%, demonstrating the significance of leveraging time in detecting spam accounts.
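The abstract does not list the individual features, so the following is only a minimal sketch of what a time-based behavioral feature could look like, not the authors' method. It assumes that each account is represented by a list of tweet timestamps; the function name `posting_gap_features` and the feature names `mean_gap`, `std_gap`, and `gap_cv` are hypothetical and chosen for illustration. The idea is that an account posting at near-constant intervals has a coefficient of variation close to zero, which a classifier such as Random Forest could use as one input among others.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def posting_gap_features(timestamps):
    """Return illustrative time-based features for one account.

    `timestamps` is a list of datetime objects, one per tweet.
    The feature names are hypothetical, not taken from the paper.
    """
    ordered = sorted(timestamps)
    # Gaps between consecutive tweets, in seconds.
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        # Fewer than two tweets: no gap can be measured.
        return {"mean_gap": 0.0, "std_gap": 0.0, "gap_cv": 0.0}
    mean_gap = mean(gaps)
    std_gap = pstdev(gaps)
    # Coefficient of variation: near 0 means evenly spaced posts.
    gap_cv = std_gap / mean_gap if mean_gap > 0 else 0.0
    return {"mean_gap": mean_gap, "std_gap": std_gap, "gap_cv": gap_cv}


# Hypothetical account that posts exactly every 10 minutes.
start = datetime(2016, 11, 1, 12, 0, 0)
scheduled = [start + timedelta(minutes=10 * i) for i in range(6)]

# Hypothetical account with irregular posting times (offsets in minutes).
irregular = [start + timedelta(minutes=m) for m in (0, 3, 47, 52, 180, 400)]

print(posting_gap_features(scheduled))
print(posting_gap_features(irregular))
```

In this sketch the scheduled account yields a `gap_cv` of zero while the irregular one yields a clearly larger value. Whether such a feature separates spammers from legitimate users on real data is an empirical question that the sketch does not answer.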

Keywords

Spam; Legitimate users; Machine learning; Honeypot; Time

Domains

Computer Science [cs]

Contributor: Documentation IRIT

https://hal.science/hal-03159077

Submitted on: Thursday, March 4, 2021, 11:54:05

Last modified on: Monday, November 20, 2023, 11:44:23

Dates and versions

hal-03159077 , version 1 (04-03-2021)

Identifiers

HAL Id : hal-03159077 , version 1
DOI : 10.1145/3012071.3012078

Cite

Mahdi Washha, Aziz Qaroush, Florence Sèdes. Leveraging Time for Spammers Detection on Twitter. 8th International Conference on Management of Emergent Digital EcoSystems (MEDES 2016), Nov 2016, Hendaye, France. pp. 109-116, ⟨10.1145/3012071.3012078⟩. ⟨hal-03159077⟩

Collections

UNIV-TLSE2 CNRS SMS UT1-CAPITOLE IRIT IRIT-SIG IRIT-GD IRIT-UT3 TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP

21 views

0 downloads
