Analysis of Simple Data Imputation in Disease Dataset
- DOI
- 10.2991/icst-18.2018.98
- Keywords
- analysis; simple imputation; disease dataset; fuzzy c-means
- Abstract
In statistical data collection, it is common for some variables to have no recorded response, that is, to be empty. Such missing values can cause problems in data analysis. In this research we analyze several simple imputation techniques for handling missing values: zero imputation, mean imputation, median imputation, and random imputation. The study uses the Pima Indians, hepatitis, and breast cancer Wisconsin datasets from the UCI Machine Learning Repository. We also compare these techniques with the removal of incomplete data. Applying the simple imputation techniques to the disease datasets can increase accuracy compared with deleting incomplete data. Zero imputation shows the best performance among both the imputation techniques and the incomplete-data removal technique.
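The four imputation strategies and the deletion baseline named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `impute` and `drop_incomplete` are hypothetical, missing values are assumed to be encoded as NaN, and "random imputation" is assumed here to mean drawing replacement values from the observed entries of the same column, which the abstract does not specify.

```python
import numpy as np


def impute(X, strategy="zero", seed=0):
    """Fill NaN entries column by column and return a new array.

    Hypothetical helper for illustration only; the strategy names follow
    the abstract, the details are assumptions.
    """
    X = np.array(X, dtype=float)  # copy, so the caller's data is untouched
    rng = np.random.default_rng(seed)
    for j in range(X.shape[1]):
        col = X[:, j]  # view into X, so assignments below modify X
        missing = np.isnan(col)
        if not missing.any():
            continue
        observed = col[~missing]
        if strategy == "zero":
            fill = 0.0
        elif observed.size == 0:
            raise ValueError(f"column {j} has no observed values")
        elif strategy == "mean":
            fill = observed.mean()
        elif strategy == "median":
            fill = np.median(observed)
        elif strategy == "random":
            # Assumption: sample (with replacement) from the observed
            # values of the same column.
            fill = rng.choice(observed, size=int(missing.sum()))
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        col[missing] = fill
    return X


def drop_incomplete(X):
    """Deletion baseline: keep only rows with no missing value."""
    X = np.asarray(X, dtype=float)
    return X[~np.isnan(X).any(axis=1)]
```

Each imputed array keeps every row of the input, whereas `drop_incomplete` shrinks the dataset, which is one plausible reason deletion can cost accuracy on small disease datasets.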
- Copyright
- © 2018, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Fetty Tri Anggraeny
AU - Intan Yuniar Purbasari
AU - M. Syahrul Munir
AU - Faisal Muttaqin
AU - Eka Prakarsa Mandyarta
AU - Fawwaz Ali Akbar
PY - 2018/12
DA - 2018/12
TI - Analysis of Simple Data Imputation in Disease Dataset
BT - Proceedings of the International Conference on Science and Technology (ICST 2018)
PB - Atlantis Press
SP - 471
EP - 475
SN - 2589-4943
UR - https://doi.org/10.2991/icst-18.2018.98
DO - 10.2991/icst-18.2018.98
ID - Anggraeny2018/12
ER -