Impact of Various Data Splitting Ratios on the Performance of Machine Learning Models in the Classification of Lung Cancer
- DOI
- 10.2991/978-94-6463-252-1_12
- Keywords
- Lung cancer (LC); Artificial Neural Network (ANN); Data Splitting Ratios
- Abstract
Owing to revolutionary technological advances and exceptional experimental data, particularly in image analysis and processing, artificial intelligence (AI) and machine learning (ML) have lately become widely popular buzzwords. Medical specialties in which imaging is essential, such as radiology, pathology, and oncology, have seized this opportunity, and significant research and development effort has gone into translating the promise of AI and ML into therapeutic applications. These tools are increasingly used for common medical image analysis tasks, including diagnosis, segmentation, and classification. In this study, four classifiers, Artificial Neural Network (ANN), Support Vector Machine (SVM), Naïve Bayes (NB), and K-Nearest Neighbour (KNN), are used to classify lung cancer based on features extracted by a lung segmentation algorithm. The feature data are estimated from 90 image sets, combined for normalization, and divided into training, validation, and testing sets in a ratio of 80:10:10. To assess model performance, different ratios (80/20, 70/30, 60/40, and 50/50) were used to divide the datasets into training and testing sets. ANN and KNN achieved an accuracy of 99.8% with moderate and high proportions of training data.
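The splitting scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the helper name `split_indices`, the fixed seed, and the choice of one feature vector per image set (so 90 samples in total) are all assumptions made for illustration, and the abstract does not say whether the split was shuffled or stratified.

```python
import random


def split_indices(n_samples, percents, seed=0):
    """Shuffle indices 0..n_samples-1 and cut them into consecutive parts.

    `percents` are integer percentages that sum to 100, e.g. (80, 20)
    or (80, 10, 10). The last part absorbs any rounding remainder.
    (Hypothetical helper; not taken from the paper.)
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    parts, start = [], 0
    for pct in percents[:-1]:
        stop = start + round(n_samples * pct / 100)
        parts.append(indices[start:stop])
        start = stop
    parts.append(indices[start:])
    return parts


# Assumption: one feature vector per image set, so 90 samples in total.
N_SAMPLES = 90

# Two-way train/test ratios named in the abstract.
for train_pct in (80, 70, 60, 50):
    train, test = split_indices(N_SAMPLES, (train_pct, 100 - train_pct))
    print(f"{train_pct}/{100 - train_pct}: {len(train)} train, {len(test)} test")

# Three-way train/validation/test split at 80:10:10.
train, val, test = split_indices(N_SAMPLES, (80, 10, 10))
print(f"80:10:10: {len(train)} train, {len(val)} validation, {len(test)} test")
```

Under these assumptions, a 50/50 split leaves only 45 samples for training, which is why comparing classifiers across ratios is informative on a dataset of this size. Reusing the same seed across ratios keeps the comparison reproducible.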
- Copyright
- © 2023 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Archana Nazarkar
AU - Harish Kuchulakanti
AU - Chandra Sekhar Paidimarry
AU - Sravya Kulkarni
PY - 2023
DA - 2023/11/09
TI - Impact of Various Data Splitting Ratios on the Performance of Machine Learning Models in the Classification of Lung Cancer
BT - Proceedings of the Second International Conference on Emerging Trends in Engineering (ICETE 2023)
PB - Atlantis Press
SP - 96
EP - 104
SN - 2352-5401
UR - https://doi.org/10.2991/978-94-6463-252-1_12
DO - 10.2991/978-94-6463-252-1_12
ID - Nazarkar2023
ER -