An Exploratory Study of Tweets about the SARS-CoV-2 Omicron Variant: Insights from Sentiment Analysis, Language Interpretation, Source Tracking, Type Classification, and Embedded URL Detection - Archive ouverte HAL Accéder directement au contenu
Article Dans Une Revue COVID Année : 2022

An Exploratory Study of Tweets about the SARS-CoV-2 Omicron Variant: Insights from Sentiment Analysis, Language Interpretation, Source Tracking, Type Classification, and Embedded URL Detection

Résumé

This paper presents the findings of an exploratory study on the continuously generating Big Data on Twitter related to the sharing of information, news, views, opinions, ideas, knowledge, feedback, and experiences about the COVID-19 pandemic, with a specific focus on the Omicron variant, which is the globally dominant variant of SARS-CoV-2 at this time. A total of 12,028 tweets about the Omicron variant were studied, and the specific characteristics of the tweets that were analyzed include sentiment, language, source, type, and embedded URLs. The findings of this study are manifold. First, from sentiment analysis, it was observed that 50.5% of tweets had a ‘neutral’ emotion. The other emotions—‘bad’, ‘good’, ‘terrible’, and ‘great’—were found in 15.6%, 14.0%, 12.5%, and 7.5% of the tweets, respectively. Second, the findings of language interpretation showed that 65.9% of the tweets were posted in English. It was followed by Spanish or Castillian, French, Italian, Japanese, and other languages, which were found in 10.5%, 5.1%, 3.3%, 2.5%, and <2% of the tweets, respectively. Third, the findings from source tracking showed that “Twitter for Android” was associated with 35.2% of tweets. It was followed by “Twitter Web App”, “Twitter for iPhone”, “Twitter for iPad”, “TweetDeck”, and all other sources that accounted for 29.2%, 25.8%, 3.8%, 1.6%, and <1% of the tweets, respectively. Fourth, studying the type of tweets revealed that retweets accounted for 60.8% of the tweets, it was followed by original tweets and replies that accounted for 19.8% and 19.4% of the tweets, respectively. Fifth, in terms of embedded URL analysis, the most common domain embedded in the tweets was found to be twitter.com, which was followed by biorxiv.org, nature.com, wapo.st, nzherald.co.nz, recvprofits.com, science.org, and other domains. Finally, to support research and development in this field, we have developed an open-access Twitter dataset that comprises Tweet IDs of more than 500,000 tweets about the Omicron variant, posted on Twitter since the first detected case of this variant on 24 November 2021.

Dates et versions

hal-03748759 , version 1 (10-08-2022)

Identifiants

Citer

Nirmalya Thakur, Chia Han. An Exploratory Study of Tweets about the SARS-CoV-2 Omicron Variant: Insights from Sentiment Analysis, Language Interpretation, Source Tracking, Type Classification, and Embedded URL Detection. COVID, 2022, 2 (8), pp.1026-1049. ⟨10.3390/covid2080076⟩. ⟨hal-03748759⟩

Collections

TDS-MACS
132 Consultations
0 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More