Title: Detecting the risk of COVID-19 spread in near real-time using social media
Authors: Mohammed Ahsan Raza Noori; Bharti Sharma; Ritika Mehra
Addresses: School of Computing, DIT University, Dehradun, 248009, Uttarakhand, India; School of Computing, DIT University, Dehradun, 248009, Uttarakhand, India; School of Computer Science and Engineering, Dev Bhoomi Uttarakhand University, Dehradun, 248007, Uttarakhand, India
Abstract: COVID-19 is a contagious disease caused by SARS-CoV-2. To limit its spread, the WHO recommended preventive measures such as social distancing, testing, lockdowns, and face masks. Failure to implement and monitor these measures increases the risk of spread and the mortality rate. This paper proposes a near real-time system that uses Twitter to detect the risk of COVID-19 spread. The system uses the Apache Spark framework for text mining, machine learning, and near real-time processing of Twitter data. Five base machine learning classifiers, support vector machine (SVM), logistic regression (LR), multilayer perceptron (MLP), decision tree (DT), and Naive Bayes (NB), are combined to form an ensemble majority voting classifier (EMVC). Results show that the EMVC achieved an accuracy of 94.76%. The proposed system was then tested in real time to detect tweets related to the risk of COVID-19 spread in London, Mumbai, and New York in June 2020.
Keywords: COVID-19; coronavirus; risk detection; social media; Twitter; machine learning; ensemble learning; near real-time system; Apache Spark.
International Journal of Emergency Management, 2023 Vol.18 No.2, pp.202 - 223
Received: 15 Feb 2021
Accepted: 10 Mar 2022
Published online: 05 Jul 2023
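
Illustration: the abstract describes combining five base classifiers into an ensemble majority voting classifier (EMVC). The following is a minimal sketch of hard majority voting in plain Python, not the authors' Apache Spark implementation. The label names ("risk", "no_risk"), the function names, and the sample predictions are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by the most base classifiers for one tweet."""
    counts = Counter(votes)
    # max() returns the first label that reaches the top count, so a tie
    # goes to the label that appeared earliest in the vote list.
    return max(counts, key=counts.get)


def ensemble_predict(per_model_predictions):
    """Combine per-model label lists (one label per tweet) by majority vote."""
    # zip(*...) regroups the predictions so each tuple holds all votes for one tweet.
    votes_per_tweet = zip(*per_model_predictions.values())
    return [majority_vote(votes) for votes in votes_per_tweet]


# Hypothetical predictions from the five base classifiers for three tweets.
base_predictions = {
    "SVM": ["risk", "no_risk", "risk"],
    "LR": ["risk", "no_risk", "no_risk"],
    "MLP": ["no_risk", "no_risk", "risk"],
    "DT": ["risk", "risk", "no_risk"],
    "NB": ["no_risk", "no_risk", "risk"],
}

labels = ensemble_predict(base_predictions)
print(labels)
```

With five voters and two labels a tie cannot occur, which is one practical reason to use an odd number of base classifiers for binary decisions.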