Objectivity and Subjectivity Classification with BERT for Bahasa Melayu
- DOI
- 10.2991/978-94-6463-094-7_20
- Keywords
- Objectivity; Word2Vec; Subjectivity classification; BERT; Sentiment classification
- Abstract
This research presents the notions of subjectivity and objectivity in Bahasa Melayu. Two types of word embedding models, Word2Vec and BERT, are developed for subjectivity classification and sentiment classification, using Wikipedia data as the objectivity dataset, Twitter data as the subjectivity dataset, and a combination of both datasets. A pre-trained BERT embedding model, Bert-Base-Bahasa-Cased, is used as a reference. First, the datasets are fed into each embedding model to be embedded as vectors. Subjectivity classification and sentiment classification are then carried out with a 70:30 train-test split, using Logistic Regression, Random Forest, and Double Layer Neural Network classifiers. Logistic Regression on the Bert-Base-Bahasa-Cased model achieved the highest results: 99.95% in subjectivity classification and 74.30% in sentiment classification.
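The evaluation procedure described in the abstract (embed the text as vectors, split 70:30, train a classifier, score it on the held-out portion) can be sketched roughly as follows. This is not the authors' code. It is a minimal, standard-library-only illustration under several assumptions: synthetic Gaussian vectors stand in for the Word2Vec or BERT embeddings, only the Logistic Regression classifier is shown, and the helper names (`train_test_split`, `fit_logistic_regression`, `toy_embedding`), the vector dimension, the learning rate, and the epoch count are all hypothetical choices made for the example.

```python
import math
import random


def train_test_split(rows, labels, test_fraction=0.3, seed=0):
    """Shuffle the indices, then hold out `test_fraction` of them for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    def pick(ids, seq):
        return [seq[i] for i in ids]

    return (pick(train_idx, rows), pick(test_idx, rows),
            pick(train_idx, labels), pick(test_idx, labels))


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_regression(X, y, lr=0.5, epochs=200):
    """Batch gradient descent on the log loss; returns (weights, bias)."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, X):
    """Label 1 when the linear score is non-negative, else 0."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else 0
            for xi in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# ASSUMPTION: synthetic vectors replace real sentence embeddings.
# Label 0 plays the role of "objective", label 1 of "subjective".
rng = random.Random(42)
DIM = 8  # hypothetical embedding size, far smaller than a real model's


def toy_embedding(center):
    return [rng.gauss(center, 1.0) for _ in range(DIM)]


X = [toy_embedding(-1.0) for _ in range(100)] + \
    [toy_embedding(1.0) for _ in range(100)]
y = [0] * 100 + [1] * 100

X_train, X_test, y_train, y_test = train_test_split(X, y)  # 70:30 split
w, b = fit_logistic_regression(X_train, y_train)
test_accuracy = accuracy(y_test, predict(w, b, X_test))
print(f"train={len(X_train)} test={len(X_test)} accuracy={test_accuracy:.3f}")
```

In practice the synthetic vectors would be replaced by the outputs of an embedding model, and a library implementation of the classifiers would normally be preferred over hand-written gradient descent. The score printed here says nothing about the figures reported in the abstract.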
- Copyright
- © 2022 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Wing Kin Chong
AU - Hu Ng
AU - Timothy Tzen Vun Yap
AU - Wooi King Soo
AU - Vik Tor Goh
AU - Dong Theng Cher
PY - 2022
DA - 2022/12/27
TI - Objectivity and Subjectivity Classification with BERT for Bahasa Melayu
BT - Proceedings of the International Conference on Computer, Information Technology and Intelligent Computing (CITIC 2022)
PB - Atlantis Press
SP - 246
EP - 257
SN - 2589-4900
UR - https://doi.org/10.2991/978-94-6463-094-7_20
DO - 10.2991/978-94-6463-094-7_20
ID - Chong2022
ER -