Sentiment Analysis of IMDb Movie Reviews Using SVM and Naive Bayes Classifier
DOI: https://doi.org/10.3126/jes2.v4i1.70138
Keywords: IMDb Movie Review, Naive Bayes, CNN-SVM, TF-IDF
Abstract
Sentiment analysis is a powerful tool for understanding public opinion, especially in the entertainment industry. Opinions expressed in text reviews play a significant role in the success of a movie, and text-based data analysis is widely used to recognize the sentiment they carry. Classifying sentiment correctly matters to both consumers and organizations, but large and complex data makes classification more challenging. This quantitative study classifies the sentiment of the IMDb movie review dataset using two supervised Machine Learning (ML) models, Naive Bayes (NB) and Support Vector Machines (SVM). Reviews were classified as positive or negative in order to identify the best-fitting model for large-scale review classification. The 50,000 IMDb movie reviews went through preprocessing and feature extraction, using Term Frequency-Inverse Document Frequency (TF-IDF) to transform the raw text into numerical form. The positive and negative reviews were evenly distributed. The SVM and NB models were trained and assessed on various train-test splits to ensure robust evaluation. Precision, Recall, and F1 Score were the performance metrics used to evaluate the models. Based on the evaluation, the SVM model outperformed Naive Bayes in accuracy: SVM achieved an average accuracy of 88%, while Naive Bayes achieved 85%. This research can help filmmakers understand viewer preferences, which is crucial for market strategy and content creation.
Keywords: IMDb Movie Review; Naive Bayes; SVM; TF-IDF
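As an illustration of two steps the abstract names, TF-IDF weighting and the Precision, Recall, and F1 metrics, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation. The function names (`tokenize`, `tfidf`, `precision_recall_f1`), the simple tokenizer, the label string `"pos"`, and the choice of a smoothed IDF variant are all assumptions made for this example; the abstract does not say which TF-IDF variant or tooling was used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters/apostrophes (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length,
    idf = ln((1 + N) / (1 + df)) + 1 (a common smoothed form).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def precision_recall_f1(y_true, y_pred, positive="pos"):
    """Compute precision, recall, and F1 for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    # Toy reviews, invented for this example only.
    reviews = ["great movie great cast", "terrible movie"]
    for vector in tfidf(reviews):
        print(vector)
    print(precision_recall_f1(["pos", "pos", "neg", "neg"],
                              ["pos", "neg", "neg", "pos"]))
```

In this sketch a term that appears in every review (here "movie") gets the minimum IDF of 1, so terms concentrated in fewer reviews (here "great") receive relatively more weight. In a full pipeline the resulting vectors would be fed to the NB and SVM classifiers, and the metrics would be computed on the held-out test split.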
License
Copyright (c) 2025 The Author(s)

This work is licensed under a Creative Commons Attribution 4.0 International License.
CC BY: This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.