Proceedings of the 2nd International Seminar on Science and Technology (ISSTEC 2019)

K-Means Clustering Optimization Using the Elbow Method and Early Centroid Determination Based on Mean and Median Formula

Authors
Edy Umargono, Jatmiko Endro Suseno, S.K Vincensius Gunawan
Corresponding Author
Edy Umargono
Available Online 11 October 2020.
DOI
10.2991/assehr.k.201010.019
Keywords
clustering, optimization, elbow method, mean data
Abstract

K-Means is the most widely used algorithm among partition-based clustering methods. It is an iterative algorithm in which the user specifies the number of clusters and an initial centroid for each cluster, so that similarity among members of the same cluster is high while similarity to members of other clusters is very low. Historically, K-Means remains the best among grouping algorithms, able to group data in relatively fast and efficient computing time. The algorithm is widely implemented in industrial and scientific applications and is well suited to quantitative data with numeric attributes, but it still has weaknesses: the number of clusters is chosen by assumption, and the result relies heavily on the initial selection of centroids. To overcome these weaknesses, this study proposes using the elbow method to determine the best number of clusters and determining the initial centroids from the mean and median of the data. The results indicate that determining the initial cluster centers from the mean of the data reduces the number of iterations needed to achieve uniformity in the clusters by 22.58% compared with random initial centers, and that choosing the number of clusters with the elbow method requires 25% fewer iterations than using other numbers of clusters.
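The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the mean/median formula, so the hypothetical `init_centroids` below assumes one possible scheme (order the points by the sum of their attributes, split them into k equal parts, and take the mean or median of each part). The elbow is likewise picked with one common heuristic, the largest second difference of the SSE curve; the function names, the synthetic data, and `k_max` are all assumptions made for illustration.

```python
import numpy as np

def init_centroids(X, k, stat="mean"):
    """Deterministic initial centroids (assumed scheme, not the paper's formula)."""
    order = np.argsort(X.sum(axis=1))        # order points by attribute sum
    chunks = np.array_split(X[order], k)     # k nearly equal parts
    reduce = np.mean if stat == "mean" else np.median
    return np.array([reduce(c, axis=0) for c in chunks])

def kmeans(X, centroids, max_iter=100):
    """Lloyd's algorithm; returns labels, centroids, SSE and iterations used."""
    centroids = np.asarray(centroids, dtype=float).copy()
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                            # assignments stable: converged
        labels = new_labels
        # Move each centroid to the mean of its members (empty clusters stay put).
        for j in range(len(centroids)):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    sse = float(((X - centroids[labels]) ** 2).sum())
    return labels, centroids, sse, iteration

def elbow_k(X, k_max=6, stat="mean"):
    """Pick k at the sharpest bend of the SSE curve (second-difference heuristic)."""
    ks = list(range(1, k_max + 1))
    sse = [kmeans(X, init_centroids(X, k, stat))[2] for k in ks]
    bend = np.diff(sse, n=2)                 # bend[i] belongs to k = ks[i + 1]
    return ks[int(bend.argmax()) + 1], sse

# Synthetic demo data (an assumption): three well-separated 2-D groups.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.66]])
X = np.vstack([c + rng.normal(scale=0.2, size=(30, 2)) for c in centers])

best_k, sse_curve = elbow_k(X)
labels, _, _, iters_mean = kmeans(X, init_centroids(X, best_k, "mean"))
random_start = X[rng.choice(len(X), size=best_k, replace=False)]
_, _, _, iters_random = kmeans(X, random_start)
print(best_k, iters_mean, iters_random)
```

Comparing `iters_mean` with `iters_random` over many random starts would be the analogue of the iteration-count comparison reported in the abstract; the figures 22.58% and 25% come from the authors' data and are not reproduced by this sketch.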

Copyright
© 2020, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2nd International Seminar on Science and Technology (ISSTEC 2019)
Series
Advances in Social Science, Education and Humanities Research
Publication Date
11 October 2020
ISBN
978-94-6239-168-0
ISSN
2352-5398

Cite this article

TY  - CONF
AU  - Edy Umargono
AU  - Jatmiko Endro Suseno
AU  - S.K Vincensius Gunawan
PY  - 2020
DA  - 2020/10/11
TI  - K-Means Clustering Optimization Using the Elbow Method and Early Centroid Determination Based on Mean and Median Formula
BT  - Proceedings of the 2nd International Seminar on Science and Technology (ISSTEC 2019)
PB  - Atlantis Press
SP  - 121
EP  - 129
SN  - 2352-5398
UR  - https://doi.org/10.2991/assehr.k.201010.019
DO  - 10.2991/assehr.k.201010.019
ID  - Umargono2020
ER  -