Sentiment Analysis on Internet Movie Database (IMDb) Movie Review Dataset: Hyperparameters Tuning for Naïve Bayes Model
- DOI
- 10.2991/978-94-6463-300-9_73
- Keywords
- Sentiment Classification; Machine Learning; Naive Bayes Classifier; Hyperparameter Tuning
- Abstract
Sentiment classification plays a crucial role in understanding and analyzing text data, particularly in domains such as social media and online reviews. This study investigates how three key parameters affect the accuracy of sentiment classification when a Naive Bayes classifier is applied to the Internet Movie Database (IMDb) movie review dataset. To explore the impact of the training set ratio, the proportion of data allocated to the training set was varied while the other parameters were held constant. The results indicate that increasing the training set ratio from 50% to 90% leads to a gradual improvement in classification accuracy. This finding suggests that a larger training set provides more representative samples for learning, enhancing the model’s ability to generalize. Next, the impact of the maximum features parameter, which sets the dimensionality of the feature space, was investigated. Varying the number of features showed that a larger value of max features, such as 4096, produces better accuracy. Finally, the impact of the smoothing parameter alpha on classification accuracy was investigated. The experiments showed that different alpha values, such as 0.1, 0.5, and 1, had minimal influence on accuracy. This suggests that the Naive Bayes classifier is relatively robust to variations in the smoothing parameter in the context of sentiment classification. Overall, the findings emphasize the significance of a larger training set and an optimal number of features for improving accuracy, whereas the influence of the smoothing parameter appears to be limited in this context.
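The three parameters discussed in the abstract can be illustrated with a minimal sketch. The abstract does not say which library or feature extraction was used, so what follows is an assumption: a pure-Python multinomial Naive Bayes over word counts, with hypothetical helper names (`build_vocab`, `train_nb`, `predict`, `evaluate`) and a tiny invented corpus standing in for the IMDb reviews. It is not the author's implementation, and its accuracies say nothing about the reported results.

```python
import math
import random
from collections import Counter

def tokenize(text):
    # Lowercased whitespace tokens; real preprocessing is not described in the abstract.
    return text.lower().split()

def build_vocab(docs, max_features):
    # Keep the max_features most frequent tokens (ties broken alphabetically).
    counts = Counter(tok for doc in docs for tok in tokenize(doc))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {tok for tok, _ in ranked[:max_features]}

def train_nb(docs, labels, vocab, alpha):
    # Multinomial Naive Bayes with additive smoothing; alpha must be > 0.
    classes = sorted(set(labels))
    token_counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        token_counts[label].update(t for t in tokenize(doc) if t in vocab)
    class_counts = Counter(labels)
    log_prior = {c: math.log(class_counts[c] / len(labels)) for c in classes}
    log_like = {}
    for c in classes:
        total = sum(token_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((token_counts[c][t] + alpha) / total)
                       for t in vocab}
    return log_prior, log_like

def predict(model, doc):
    # Pick the class with the highest log posterior; out-of-vocabulary tokens are skipped.
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(log_like[c][t] for t in tokenize(doc)
                                    if t in log_like[c])
              for c in log_prior}
    return max(scores, key=scores.get)

def evaluate(docs, labels, train_ratio, max_features, alpha, seed=0):
    # Shuffle, split by train_ratio, fit on the training part, score the rest.
    idx = list(range(len(docs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_ratio)
    train, test = idx[:cut], idx[cut:]
    train_docs = [docs[i] for i in train]
    train_labels = [labels[i] for i in train]
    vocab = build_vocab(train_docs, max_features)
    model = train_nb(train_docs, train_labels, vocab, alpha)
    correct = sum(predict(model, docs[i]) == labels[i] for i in test)
    return correct / len(test)

# Invented toy reviews (assumption), only so the sketch runs end to end.
POSITIVE = ["a wonderful moving film", "great acting and a great story",
            "i loved this brilliant movie", "wonderful cast and brilliant direction",
            "a great fun movie", "loved the story and the acting"]
NEGATIVE = ["a dull boring film", "terrible acting and an awful story",
            "i hated this boring movie", "awful cast and terrible direction",
            "a dull slow movie", "hated the story and the acting"]
DOCS = POSITIVE + NEGATIVE
LABELS = ["pos"] * len(POSITIVE) + ["neg"] * len(NEGATIVE)

# Vary one parameter at a time, as the abstract describes.
for ratio in (0.5, 0.7, 0.9):
    print("train_ratio", ratio, evaluate(DOCS, LABELS, ratio, 4096, 1.0))
for alpha in (0.1, 0.5, 1.0):
    print("alpha", alpha, evaluate(DOCS, LABELS, 0.7, 4096, alpha))
```

With a corpus this small the printed accuracies are noisy. The sketch only shows where the training set ratio, the feature cap, and the smoothing term enter the computation.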
- Copyright
- © 2023 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Haoran Li
PY - 2023
DA - 2023/11/27
TI - Sentiment Analysis on Internet Movie Database (IMDb) Movie Review Dataset: Hyperparameters Tuning for Naïve Bayes Model
BT - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB - Atlantis Press
SP - 693
EP - 699
SN - 2352-538X
UR - https://doi.org/10.2991/978-94-6463-300-9_73
DO - 10.2991/978-94-6463-300-9_73
ID - Li2023
ER -