Journal articles

Behavioural account-based features for filtering out social spammers in large-scale twitter data collections

Mahdi Washha 1 Manel Mezghani 1 Florence Sèdes 1
1 IRIT-SIG - Systèmes d’Informations Généralisées
IRIT - Institut de recherche en informatique de Toulouse
Abstract : Online social networks (OSNs) have become an important source of information for a wide range of applications and research. However, the high usability and accessibility of OSNs have exposed many information quality (IQ) problems, which in turn degrade the performance of OSN-dependent applications. Social spammers are a particular kind of ill-intentioned user who degrade the quality of OSN information by misusing the services that OSNs provide. Since Twitter is not immune to the social spam problem, researchers have designed various methods for detecting spam content. However, tweet-based detection methods are not effective at detecting spam content because spam is dynamic and evolves quickly. Moreover, robust account-based features are costly to extract because they require a huge volume of data from Twitter’s servers, while most other account-based features do not model the behavior of social spammers. Hence, in this paper, we introduce a design of 10 new robust behavioral account-based features for filtering out spam accounts in large-scale "crawled" Twitter data collections. Our features focus on modeling the behavior of social spammers, such as the time correlation among tweets. The experimental results show that our new behavioral features correctly classify the majority of social spammers (spam accounts), outperforming 75 account-based features designed in the literature.
Document type :
Journal articles
Cited literature [47 references]

https://hal.archives-ouvertes.fr/hal-02548073
Contributor : Open Archive Toulouse Archive Ouverte (oatao)
Submitted on : Monday, April 20, 2020 - 2:34:43 PM
Last modification on : Tuesday, September 8, 2020 - 10:24:03 AM

File

Washha_22241.pdf
Files produced by the author(s)

Citation

Mahdi Washha, Manel Mezghani, Florence Sèdes. Behavioural account-based features for filtering out social spammers in large-scale twitter data collections. Revue des Sciences et Technologies de l'Information - Série ISI : Ingénierie des Systèmes d'Information, Lavoisier, 2017, 22 (3), pp.65-88. ⟨10.3166/ISI.22.3.65-88⟩. ⟨hal-02548073⟩
