Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)

Sentiment Analysis on Internet Movie Database (IMDb) Movie Review Dataset: Hyperparameters Tuning for Naïve Bayes Model

Authors
Haoran Li1, *
1Department of Material Science and Engineering, Tokyo Institute of Technology, Yokohama, 226-8503, Japan
*Corresponding author. Email: 1912231219@mail.sit.edu.cn
Available Online 27 November 2023.
DOI
10.2991/978-94-6463-300-9_73
Keywords
Sentiment Classification; Machine Learning; Naive Bayes Classifier; Hyperparameter Tuning
Abstract

Sentiment classification plays a crucial role in understanding and analyzing text data, particularly in domains such as social media and online reviews. This study investigates how three key parameters affect the accuracy of sentiment classification when a Naive Bayes classifier is applied to the Internet Movie Database (IMDb) movie review dataset. First, the proportion of data allocated to the training set was varied while the other parameters were held constant. Increasing the training set ratio from 50% to 90% led to a gradual improvement in classification accuracy, which suggests that a larger training set provides more representative samples for learning and improves the model’s ability to generalize. Second, the maximum features parameter, which sets the dimensionality of the feature space, was varied. A larger value of max features, such as 4096, produced better accuracy. Third, the smoothing parameter alpha was varied. Different alpha values, such as 0.1, 0.5, and 1, had minimal influence on accuracy, which suggests that the Naive Bayes classifier is relatively robust to the smoothing parameter in the context of sentiment classification. Overall, the findings emphasize the importance of a larger training set and a suitably chosen number of features for improving accuracy, whereas the influence of the smoothing parameter appears to be limited in this context.
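The three parameters described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the experiments were implemented, so everything below is an assumption made for illustration: a pure-Python multinomial Naive Bayes with additive (Laplace) smoothing, a whitespace tokenizer, a vocabulary capped at the `max_features` most frequent training tokens, and a tiny invented corpus standing in for the IMDb reviews. The function names (`build_vocab`, `train_nb`, `predict`, `evaluate`) are hypothetical, and the accuracies printed on the toy corpus say nothing about the paper's results.

```python
import math
from collections import Counter

# Toy stand-in corpus (invented for illustration; NOT the IMDb dataset).
DOCS = [
    "great movie loved it", "terrible plot boring acting",
    "loved the acting great fun", "boring and terrible waste",
    "great fun loved every scene", "awful boring plot",
    "wonderful great acting", "terrible awful waste",
    "loved it great", "boring terrible",
]
LABELS = ["pos", "neg"] * 5

def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()

def build_vocab(docs, max_features):
    # Keep the max_features most frequent tokens; ties broken alphabetically.
    counts = Counter(tok for d in docs for tok in tokenize(d))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {tok for tok, _ in ranked[:max_features]}

def train_nb(docs, labels, vocab, alpha):
    # Multinomial Naive Bayes with additive smoothing (alpha must be > 0).
    class_docs = Counter(labels)
    word_counts = {c: Counter() for c in class_docs}
    for d, y in zip(docs, labels):
        word_counts[y].update(t for t in tokenize(d) if t in vocab)
    model = {}
    for c in class_docs:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_lik = {t: math.log((word_counts[c][t] + alpha) / total) for t in vocab}
        model[c] = (math.log(class_docs[c] / len(docs)), log_lik)
    return model

def predict(model, vocab, doc):
    # Score each class by log prior plus summed token log-likelihoods.
    toks = [t for t in tokenize(doc) if t in vocab]
    scores = {c: prior + sum(ll[t] for t in toks) for c, (prior, ll) in model.items()}
    return max(scores, key=scores.get)

def evaluate(docs, labels, train_ratio, max_features, alpha):
    # Train on the first train_ratio share of the data, test on the rest.
    split = int(len(docs) * train_ratio)
    vocab = build_vocab(docs[:split], max_features)
    model = train_nb(docs[:split], labels[:split], vocab, alpha)
    test_docs, test_labels = docs[split:], labels[split:]
    hits = sum(predict(model, vocab, d) == y for d, y in zip(test_docs, test_labels))
    return hits / len(test_labels)

# Vary one parameter at a time while holding the others fixed.
for ratio in (0.5, 0.6, 0.8):
    print("train_ratio", ratio, evaluate(DOCS, LABELS, ratio, 50, 1.0))
for k in (1, 4, 50):
    print("max_features", k, evaluate(DOCS, LABELS, 0.8, k, 1.0))
for a in (0.1, 0.5, 1.0):
    print("alpha", a, evaluate(DOCS, LABELS, 0.8, 50, a))
```

Each loop changes a single parameter and keeps the other two fixed, mirroring the one-factor-at-a-time design the abstract describes. On real data the split would normally be shuffled or stratified; the sequential split here is only to keep the sketch deterministic.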

Copyright
© 2023 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
Series
Advances in Computer Science Research
Publication Date
27 November 2023
ISBN
978-94-6463-300-9
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Haoran Li
PY  - 2023
DA  - 2023/11/27
TI  - Sentiment Analysis on Internet Movie Database (IMDb) Movie Review Dataset: Hyperparameters Tuning for Naïve Bayes Model
BT  - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB  - Atlantis Press
SP  - 693
EP  - 699
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-300-9_73
DO  - 10.2991/978-94-6463-300-9_73
ID  - Li2023
ER  -