Assessing COVID-19 Vaccine Hesitancy via Sentiment Analysis: A Fine-Tuned Naive Bayes Classifier using Feature Engineering

Abstract: The introduction of COVID-19 vaccines was met with scepticism and fear, amplified through social media platforms such as Twitter. This work investigates public sentiment regarding vaccines by building a more accurate classifier for COVID-19-related tweets. The classifier applies transfer learning to a pre-trained model, fine-tunes it on COVID-19-related data, and uses feature engineering to account for the effect of favourite and retweet counts on a tweet’s sentiment. A COVID-19 dataset was hand-labelled to measure the accuracy of the model. Additionally, this paper examines sentiment over time and applies topic modelling to analyse the underlying categories in tweets about the COVID-19 vaccines. The results indicate an accuracy of almost 90 percent for the new classifier, which was then used to analyse 40,000 tweets. The sentiment analysis shows a more positive outlook towards vaccines over time, and topic modelling yields 5 main categories: vaccine availability information, vaccine administration, vaccine hesitancy, COVID cases, and discussion of vaccine studies.


To combat the threat of COVID-19, many pharmaceutical companies created their own vaccines and received increasing public attention.[1] Although vaccines are vital for the prevention of disease, there is considerable scepticism regarding their development and administration.[2] As such, public sentiment has become a major impediment to rapid immunisation.[3] Moreover, many people’s primary source of information on health and vaccination is social media, particularly Twitter, a platform that allows users to voice their thoughts about COVID-19 issues.[4] There is therefore an urgent need to investigate public sentiment on Twitter: knowing in which direction public perception of COVID-19 vaccination is heading helps clarify the public views, fears, and attitudes that may influence immunity goals.

Natural Language Processing (NLP) models are useful for analysing public sentiment on social media forums.[5] Sentiment analysis determines whether a piece of writing is positive, negative, or neutral, and many pre-trained models already exist for this task.[6] However, to make accurate predictions here, the model needs to be trained on COVID-19-related data. Transfer learning allows a model pre-trained on one task to be reused for others.[7] It has been very helpful in many NLP tasks because it reduces the need for annotated data.

The pre-trained Textblob classifier was used for sentiment analysis and adapted to the task of labelling COVID-19-related tweets via fine-tuning.[8] In fine-tuning, the weights are kept trainable and are adjusted for the target task. The pre-trained model thus acts as a starting point, leading to faster convergence than random initialisation.[9] Additionally, tweets carry metadata such as the number of retweets and favourites, which can give the model more context during training. To exploit this, feature engineering, a technique in which additional features are included to improve the model, was applied.[10]

In addition to sentiment analysis, topic modelling has been used to statistically determine the abstract topics that occur in the tweets.[11] To find out what the public believes, it is important to look at the different discussion points or themes people tweet about. Combining sentiment with the main points of discussion gives a comprehensive understanding of public views regarding the COVID vaccines.


Sentiment analysis inspects the given text and classifies the writer’s attitude as positive or negative.[12] This paper presents a classifier that labels an unlabelled COVID-19-related tweet as conveying either positive or negative sentiment based on that tweet’s text, number of favourites, and number of retweets. This is achieved by fine-tuning the Naive Bayes Textblob classifier, which is originally trained on a movie reviews corpus.[13] This paper also uses the classifier to analyse changes in public sentiment with regard to COVID-19 vaccine related issues, and applies Latent Dirichlet Allocation (LDA) topic modelling to further extract the underlying themes in public perception of COVID-19 vaccines.

2.1. Data Extraction and Preprocessing

A total of 125,906 tweets, from December 12, 2020 through August 12, 2021, were obtained using a large dataset available on Kaggle.[14] These tweets were collected using a query for tweets containing the following keywords: Pfizer/BioNTech, Sinopharm, Sinovac, Moderna, Oxford/AstraZeneca, Covaxin, Sputnik V. These keywords represent some of the most popular vaccines around the world. The dataset included the following metadata: tweet text content, number of retweets, and number of favourites.

The COVID-19 tweets were pre-processed so that all unnecessary information, including user ID tags, HTML tags, links, punctuation, and stop words, was removed.[15] Figure 1 shows a summary of the procedure. Subsequently, the NLTK library was used to stem each sentence before determining its polarity using the Textblob classifier.
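The cleaning steps listed above can be sketched in a few lines of standard-library Python. This is only an illustration, not the pipeline used in the paper: the function name `clean_tweet` and the tiny `STOP_WORDS` set are hypothetical, and the NLTK stemming step is omitted so that the example stays self-contained.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word list; the list used in the paper is not given.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of",
              "and", "in", "on", "for", "this", "that", "it"}


def clean_tweet(text: str) -> str:
    """Strip HTML tags, links, user mentions, punctuation and stop words."""
    text = re.sub(r"<[^>]+>", " ", text)                # HTML tags
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # links
    text = re.sub(r"@\w+", " ", text)                   # user ID tags
    # Delete every ASCII punctuation character.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    return " ".join(tokens)


cleaned = clean_tweet("@jane Got my <b>Moderna</b> shot today! It was quick: https://t.co/xyz")
print(cleaned)
```

Links are removed before punctuation so that a URL is dropped as a whole rather than being broken into word fragments.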

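$$P(c \mid x) \propto P(x_1 \mid c) \times P(x_2 \mid c) \times \dots \times P(x_n \mid c) \times P(c)$$

where $P(c \mid x)$ is the posterior probability of class $c$ given the features $x = (x_1, \dots, x_n)$, $P(c)$ is the prior probability of the class, $P(x_i \mid c)$ is the likelihood of feature $x_i$ given the class, and $P(x)$ is the prior probability of the predictor. Because $P(x)$ is the same for every class, it can be dropped when comparing classes, which is why the product form is stated up to proportionality.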

To improve detection of positive and negative sentiments using the Naive Bayes classifier, the constructed labelled dataset was then used to further train the model.
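To make the posterior computation concrete, the following is a minimal sketch of a word-count Naive Bayes classifier in plain Python. It is not Textblob's implementation: the function names, the four toy training tweets, and the use of add-one (Laplace) smoothing are all assumptions made for illustration. Log-probabilities are summed instead of multiplying raw probabilities, which selects the same class while avoiding numerical underflow.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """examples: list of (text, label). Returns (priors, per-class word counts, vocabulary)."""
    class_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)
    for text, label in examples:
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    priors = {c: n / len(examples) for c, n in class_counts.items()}
    return priors, word_counts, vocab


def classify(text, priors, word_counts, vocab):
    """Return the class maximising log P(c) + sum_i log P(x_i | c)."""
    scores = {}
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        score = math.log(prior)
        for w in text.split():
            if w in vocab:  # words never seen in training are ignored
                # Add-one smoothing keeps unseen (word, class) pairs from zeroing the product.
                score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy training data, already pre-processed.
toy = [
    ("got vaccine feel great", "pos"),
    ("grateful second dose", "pos"),
    ("bad fever after shot", "neg"),
    ("scared allergic reaction", "neg"),
]
model = train_naive_bayes(toy)
print(classify("feel grateful", *model))
print(classify("fever shot", *model))
```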

2.3. Feature Engineering with Additional Tweet Data

Feature engineering is the process of using domain knowledge of the data to create features that improve the model by providing additional inputs.[17] The focus is on the data itself: better-prepared inputs help the model classify more reliably and produce reasonable results. The favourite and retweet counts were considered as a way to further enhance the classifier.

Twitter users often favourite or retweet tweets they like. As such, there is an observable correlation between the number of favourites/retweets and the sentiment of the tweet.[18] This additional feature was provided to the classifier in an attempt to give more context for training on COVID-19 tweets. To this end, a second labelled dataset of 1320 tweets was created that included the number of favourites and retweets for each tweet. Figure 2 shows how this feature is included in the dataset: the sum of favourites and retweets is concatenated to the text.
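The concatenation step can be sketched as follows. The exact format shown in Figure 2 is not reproduced here, so this assumes the simplest reading of the description: the summed count is appended to the text as one extra whitespace-separated token. The function name is hypothetical.

```python
def add_engagement_feature(text: str, favourites: int, retweets: int) -> str:
    """Append the sum of favourites and retweets to the tweet text as an extra token."""
    return f"{text} {favourites + retweets}"


augmented = add_engagement_feature("got vaccine today", favourites=12, retweets=3)
print(augmented)
```

With a bag-of-words classifier, each distinct total becomes its own token, so one possible variation (not described in the paper) would be to bucket the counts into a few ranges before appending them.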

Figure 1. An example of pre-processing steps for each Tweet before classification.

Each tweet was classified as positive or negative by the classifier, or it was given a polarity of zero. Only the tweets with non-zero polarity were kept, in order to understand in which direction public sentiment leans and how the trend has changed over time. The process yielded 40,227 tweets classified as positive or negative.
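The filtering rule described above is simple enough to state in code. This sketch assumes each tweet already has a numeric polarity score in which zero means neutral; the function name and the example scores are hypothetical.

```python
def label_by_polarity(scored_tweets):
    """scored_tweets: iterable of (text, polarity).

    Drops tweets whose polarity is exactly zero and labels the rest by sign.
    """
    labelled = []
    for text, polarity in scored_tweets:
        if polarity == 0:
            continue  # neutral tweets are discarded
        labelled.append((text, "pos" if polarity > 0 else "neg"))
    return labelled


# Hypothetical polarity scores for three pre-processed tweets.
scored = [("feel great after dose", 0.8), ("vaccine appointment monday", 0.0), ("bad fever", -0.7)]
print(label_by_polarity(scored))
```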

2.2. Fine Tuning Classifier with COVID-19 Tweets

Ten randomly selected tweets from each day of data available were labelled as positive or negative for a total of 1320 hand-labelled tweets. Tweets considered neutral were not labelled as they were not used. The labels were then compared to Textblob’s original classification which yielded a baseline accuracy of 50.38 percent.
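The sampling and the baseline comparison can be sketched as below. The helper names, the fixed random seed, and the `(date, text)` input layout are assumptions made for illustration; the paper does not describe how the random selection was implemented.

```python
import random
from collections import defaultdict


def sample_per_day(tweets, k=10, seed=0):
    """tweets: list of (date_string, text). Returns up to k random tweets per date."""
    by_day = defaultdict(list)
    for day, text in tweets:
        by_day[day].append(text)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return {day: rng.sample(texts, min(k, len(texts)))
            for day, texts in sorted(by_day.items())}


def accuracy(predicted, actual):
    """Fraction of predicted labels that match the hand-assigned labels."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical data: 12 tweets on one day, 3 on another.
tweets = [("2020-12-12", f"tweet {i}") for i in range(12)] + \
         [("2020-12-13", f"tweet {i}") for i in range(3)]
sample = sample_per_day(tweets)
print({day: len(texts) for day, texts in sample.items()})
print(accuracy(["pos", "neg", "pos", "neg"], ["pos", "pos", "pos", "neg"]))
```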

Textblob’s Naive Bayes classifier was employed to fine tune the model. Naive Bayesian Classification is both a supervised learning and statistical classification method.[16] It is a conditional probability model that applies Bayes’ theorem with strong independence assumptions between features. The idea is to classify text based on the posterior probability of the documents belonging to the different classes. The formula is represented in the equation below.

$$P(c \mid x) = \frac{P(x \mid c)\,P(c)}{P(x)}$$

Figure 2. An example of appending the sum of favourites and retweets to the tweet text before training the classifier with this data.

The second labelled dataset was also used in training Naive Bayes models. The additional feature was an attempt to provide the model with more contextual information to better classify COVID-19-related tweets.

2.4. Opinion Mining

After training, the classifier was applied to the rest of the data to reclassify sentiment. The results were further analysed to give insights into public sentiment regarding the vaccines.

2.5. Topic Modelling

Topic modelling is used to extract different themes that appear in a large collection of documents. Topic modelling is employed in this paper as it assigns a document to a set of topics with varying weights without making any assumptions about the distance between topics, providing more realistic results than hard clustering. The Latent Dirichlet Allocation (LDA) model, the most popular topic model, is applied to determine the major categories of the tweet data.[19]

This paper applies the LDA algorithm using the scikit-learn library, which provides a range of supervised and unsupervised learning algorithms in Python.[20] The algorithm requires the number of topics as an input; after testing topic numbers from 2 through 30, 20 was chosen because it neither left out major topics nor repeated the same themes multiple times. Figure 3 below shows an example of related words grouped as a topic and their respective weights.

Figure 3. An example of related words grouped as a topic by the LDA algorithm and their respective weights.
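A word-and-weight listing like the one in Figure 3 can be produced from a fitted topic model by ranking the vocabulary by weight within each topic. The sketch below uses plain Python and a small hypothetical weight matrix; in scikit-learn, the corresponding matrix would come from the `components_` attribute of a fitted `LatentDirichletAllocation` model, with the vocabulary taken from the vectoriser. The function name and all numbers are illustrative only.

```python
def top_words(topic_word_weights, vocabulary, n=5):
    """For each topic, return the n highest-weighted (word, weight) pairs."""
    result = []
    for weights in topic_word_weights:
        ranked = sorted(zip(vocabulary, weights), key=lambda pair: pair[1], reverse=True)
        result.append(ranked[:n])
    return result


# Hypothetical vocabulary and topic-word weights (rows are topics, columns are words).
vocabulary = ["dose", "india", "cases"]
weights = [
    [0.1, 0.7, 0.2],
    [0.5, 0.1, 0.4],
]
for i, words in enumerate(top_words(weights, vocabulary, n=2)):
    print(f"topic {i}: {words}")
```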


The Textblob classifier without training on the labelled dataset yielded an accuracy of 50.38 percent. After training on the labelled text dataset, the accuracy jumped to 72.46 percent. Furthermore, training on labelled data that included the number of favourites and retweets boosted accuracy to 89.86 percent.

The new, more accurate classifier was then used to label the entire dataset of 40,227 COVID-19 tweets. The frequencies of positive and negative tweets were compared by counting the tweets labelled positive (or negative) on a given day and dividing by the total number of positive (or negative) tweets, respectively. Figures 4 and 5 show the trend of positive and negative sentiment over time. The two graphs were compared side by side and yielded a correlation of 0.586.
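The normalisation and comparison described above can be sketched as follows. The paper does not state which correlation measure was used, so this assumes the Pearson coefficient; the function names and the example dates are hypothetical.

```python
import math
from collections import Counter


def daily_share(days):
    """days: one date string per tweet of a single sentiment class.

    Returns {date: tweets on that date / all tweets of that class}.
    """
    counts = Counter(days)
    total = sum(counts.values())
    return {day: n / total for day, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical dates of positive and negative tweets.
pos_share = daily_share(["d1", "d1", "d2", "d3"])
neg_share = daily_share(["d1", "d2", "d2", "d3"])
shared_days = sorted(pos_share.keys() & neg_share.keys())
r = pearson([pos_share[d] for d in shared_days], [neg_share[d] for d in shared_days])
print(pos_share, r)
```

Dividing by each class's own total puts the two series on the same scale, so the comparison reflects when each sentiment peaks rather than which class has more tweets overall.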

Figure 4. Positive tweets frequency as classified by the enhanced classifier

The topic modelling yielded 20 topics, which were categorised into 5 major categories as shown in Figure 6 below.

Figure 5. Negative tweets frequency as classified by the enhanced classifier

The first category, vaccine information, mainly included discussion about the availability of vaccines. These tweets generally shared information regarding the location and type of vaccine available to the public. The second category, vaccine administration, consisted of people announcing that they had been vaccinated. Such tweets were generally positive and encouraged others to get the vaccine as well. The third category was vaccine hesitancy. These tweets expressed concern, were generally negative, and complained about extreme allergic reactions and fever. The fourth category was general discussion of COVID-19 cases. These tweets were generally negative and mournful about rising cases in specific countries. Finally, the fifth category was discussion of vaccine studies. These tweets were more sophisticated and discussed the efficacy and clinical trials of different COVID vaccines.

Figure 6. Results of LDA Topic Modelling sorted into five major categories


The aim of this paper was to investigate public sentiment to learn in which direction public perception of COVID-19 vaccinations is headed. The challenge was that the initial accuracy of the Textblob classifier was just 50 percent, which is no better than tossing a coin. Fine-tuning and feature engineering were successful techniques that helped increase the accuracy of the classifier from 50 percent to almost 90 percent. Using retweets and favourites in addition to tweet text proved useful in classifying tweets accurately, and this approach could be extended to areas of sentiment analysis outside of COVID vaccine data.

The new classifier is able to provide information about public sentiment and aid further research. For example, the classification shows a dramatic increase in positive sentiment after widespread vaccination was announced in March 2021.[21] At the same time, there is a smaller increase in negative sentiment, which could be attributed to vaccine hesitancy. The results also revealed that there were more positive tweets overall. This could be attributed to the classifier having an easier time detecting positive tweets, as tweets with many favourites/retweets tend to convey a positive sentiment.

COVID-19 tweets were analysed with the Latent Dirichlet Allocation (LDA) model. The resulting topics were grouped into 5 categories which illustrate the main concerns people had. The public’s concern about rising COVID cases and vaccine hesitancy is shown in the terms. India is the country mentioned the most. This could potentially be explained by India’s rapid increase in COVID cases during April.[22] Furthermore, “India” appears in many tweets containing “covaxin”, as Covaxin is an Indian-developed vaccine. Topics discussing people getting their second shot are mixed with gratitude and encouragement for others to get the vaccine. In general, tweets with positive sentiment offer encouragement for getting the vaccine.


This paper develops an enhanced classifier for detecting public sentiment in COVID-19-related tweets based on a tweet’s text, number of favourites, and number of retweets. It does so by fine-tuning Textblob’s pre-trained Naive Bayes classifier and applying feature engineering, achieving nearly 90 percent accuracy. The positive and negative sentiments of over 40,000 tweets are examined, over a period of eight months, to learn in which direction public perception of COVID-19 vaccinations has trended over time. Additionally, a hand-labelled dataset is created to measure the accuracy of the model. Furthermore, the 5 major categories related to COVID-19 vaccines are identified by applying LDA. The novel classifier and dataset can be utilised in public research on COVID-19 and in natural language research on public sentiment regarding COVID-19.

Future work in this area could involve using other social media platforms, analysing emotion in addition to sentiment, training different kinds of NLP models, and training the sentiment analysis model on more data so that it works better with tweets that come with favourite and retweet counts.


Thanks to Emaan Hariri, Graduate Student Researcher at University of California Berkeley, for guiding this research.


  1. Le, Tung Thanh, Jakob P. Cramer, Robert Chen, and Stephen Mayhew. “Evolution of the COVID-19 vaccine development landscape.” Nat Rev Drug Discov 19, no. 10 (2020): 667-668.
  2. Stobbe, Mike, and Hannah Fingerhut. “AP-NORC Poll: A Third of US Adults Skeptical of COVID Shots.” AP News. Associated Press, February 10, 2021. skeptical-vaccine-3779574a6d45d38cfc1d8615eb176b2d.
  3. Shearer, Elisa, and Amy Mitchell. “News Use across Social Media Platforms in 2020.” Pew Research Center’s Journalism Project. Pew Research Center, June 4, 2021. across-social-media-platforms-in-2020/.
  4. Mitchell, Amy, and Jacob Liedke. “About Four-in-Ten Americans Say Social Media Is an Important Way of Following COVID-19 Vaccine News.” Pew Research Center. Pew Research Center, August 24, 2021. tank/2021/08/24/about-four-in-ten-americans-say-social-media-is-


  5. Hussain A, Tahir A, Hussain Z, Sheikh Z, Gogate M, Dashtipour K, et al. “Artificial Intelligence–Enabled Analysis of Public Attitudes on Facebook and Twitter toward COVID-19 Vaccines in the United Kingdom and the United States: Observational Study.” Journal of Medical Internet Research. JMIR Publications Inc., Toronto, Canada, April 5, 2021.
  6. Terry-Jack, Mohammed. “Nlp: Pre-Trained Sentiment Analysis.” Medium. Medium, May 2, 2019. 1eb52a9d742c.
  7. Ruder, Sebastian, Matthew E. Peters, Swabha Swayamdipta, and Thomas Wolf. “Transfer learning in natural language processing.” In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorials, pp. 15-18. 2019.
  8. Loria, Steven. “textblob Documentation.” Release 0.15 2 (2018): 269.
  9. Bhavsar, Pratik. “Transfer Learning in NLP.” Medium. Modern NLP, June 26, 2021. nlp-f5035cc3f62f.
  10. Bocca, F. F., Rodrigues, L. H. A. (2016). The effect of tuning, feature engineering, and feature selection in data mining applied to rainfed sugarcane yield modelling. Computers and Electronics in Agriculture, 128, 67–76. doi:10.1016/j.compag.2016.08.015
  11. Cho, Hae-Wol. “Topic Modeling.” Osong public health and research perspectives. Korea Centers for Disease Control and Prevention, June 2019.
  12. Honchar, Alex. “Sentiment Analysis: Solutions and Applications Survey.” Medium. High Performance Analytics, February 13, 2021. and-applications-survey-9e52d3ea2ac7.
  13. Loria, Steven. “Advanced Usage: Overriding Models and the Blobber Class.” TextBlob 0.16.0 documentation, 2020. https://textblob.readthedocs.io/en/dev/advanced_usage.html#sentiment-analyzers.

  14. Preda G. (2021, March). COVID-19 All Vaccines Tweets. Retrieved July 1, 2021 from covid19-vaccines-tweets/activity
  15. Hemalatha, I., GP Saradhi Varma, and A. Govardhan. “Preprocessing the informal text for efficient sentiment analysis.” International Journal of Emerging Trends & Technology in Computer Science (IJETTCS) 1, no. 2 (2012): 58-61.
  16. Rish, Irina. “An empirical study of the naive Bayes classifier.” In IJCAI 2001 workshop on empirical methods in artificial intelligence, vol. 3, no. 22, pp. 41-46. 2001.
  17. Shekhar, Amit. “What Is Feature Engineering for Machine Learning?” Medium. MindOrks, December 6, 2019. machine-learning-d8ba3158d97a.
  18. Samuel, J., Myles, R., Kashyap, R. (2019). That Message Went Viral?! Exploratory Analytics and Sentiment Analysis into the Propagation of Tweets. In 2019 Annual Proceedings of Northeast Decision Sciences Institute (NEDSI) Conference, Philadelphia, USA.
  19. Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent dirichlet allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.

  20. Pedregosa, Fabian. “Scikit-Learn: Machine Learning in Python.” Journal of Machine Learning Research, October 12, 2011.
  21. Wise, Alana. “Biden Says U.S. Will Have Vaccine Supply for All Adults by May, Prioritizes Teachers.” NPR. NPR, March 2, 2021. updates/2021/03/02/973030394/biden-says-u-s-will-have-vaccine-supply-for-all-adults-by-may-prioritizes-teache.
  22. Mallapaty, Smriti. “India’s Massive COVID Surge Puzzles Scientists.” Nature News. Nature Publishing Group, April 21, 2021.
