Effect of Distance Metrics in Determining K-Value in K-Means Clustering Using Elbow and Silhouette Method
- DOI
- 10.2991/aisr.k.200424.051
- Keywords
- clustering, partitional-based clustering, K-means, elbow and silhouette
- Abstract
Clustering is one of the main tasks in data mining: it groups data into clusters of similar items. Common approaches include partitional-based, hierarchical-based, and density-based clustering. Partitional-based clustering divides the data into non-overlapping subsets. One of the most popular partitional-based clustering algorithms is K-means, which groups data into K clusters based on each point's distance to its cluster centroid. Because K-means is partitional, the value of K must be determined before the algorithm is run. Determining K is a significant problem because there is no universal way to find it. Two popular ways to determine K are the elbow method and the silhouette method, both of which are graph based. Before using either method, however, another factor must be decided: the distance metric. This paper shows the effect of three distance metrics (Manhattan, Euclidean, and Minkowski) on finding the value of K using the elbow and silhouette methods. In this study, the choice of distance metric had little impact on the value of K determined for K-means by the elbow and silhouette methods. Manhattan distance showed the most variation in the elbow and silhouette graphs. The elbow method is difficult to use, and its graph sometimes fails to identify a value of K.
- Copyright
- © 2020, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
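The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the helper names (`minkowski`, `kmeans`, `inertia`, `silhouette`), the choice of p = 3 for the Minkowski case, and the decision to update centroids by the arithmetic mean under every metric are all assumptions made for illustration. The sketch prints, for each metric and each candidate K, the within-cluster sum of squared distances (the quantity plotted in an elbow graph) and the mean silhouette coefficient.

```python
import random

def minkowski(a, b, p):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def kmeans(points, k, p, iters=50, seed=0):
    """Lloyd-style K-means: assign by Minkowski-p distance, update by mean.
    Using the mean for every p is an illustrative assumption."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: minkowski(pt, centroids[c], p))
                  for pt in points]
        new_centroids = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

def inertia(points, labels, centroids, p):
    """Elbow criterion: sum of squared distances to the assigned centroid."""
    return sum(minkowski(pt, centroids[lab], p) ** 2
               for pt, lab in zip(points, labels))

def silhouette(points, labels, p):
    """Mean silhouette coefficient over all points."""
    clusters = {}
    for pt, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(pt)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    total = 0.0
    for pt, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # Mean distance to the other members of the same cluster.
        a = sum(minkowski(pt, q, p) for q in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(sum(minkowski(pt, q, p) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical toy data: three well-separated blobs of 20 points each.
data_rng = random.Random(42)
blob_centers = [(0, 0), (10, 10), (0, 10)]
points = [(cx + data_rng.uniform(-1, 1), cy + data_rng.uniform(-1, 1))
          for cx, cy in blob_centers for _ in range(20)]

for name, p in [("Manhattan", 1), ("Euclidean", 2), ("Minkowski (p=3)", 3)]:
    for k in range(2, 6):
        labels, centroids = kmeans(points, k, p)
        print(f"{name:16s} K={k}  "
              f"inertia={inertia(points, labels, centroids, p):9.2f}  "
              f"silhouette={silhouette(points, labels, p):.3f}")
```

Plotting inertia against K gives the elbow graph, and plotting the silhouette value against K gives the silhouette graph. Because this K-means uses a single random initialization, the values for a given K can reflect a poor local optimum; repeated restarts would be a natural extension.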
Cite this article
TY  - CONF
AU  - Danny Matthew SAPUTRA
AU  - Daniel SAPUTRA
AU  - Liniyanti D. OSWARI
PY  - 2020
DA  - 2020/05/06
TI  - Effect of Distance Metrics in Determining K-Value in K-Means Clustering Using Elbow and Silhouette Method
BT  - Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)
PB  - Atlantis Press
SP  - 341
EP  - 346
SN  - 1951-6851
UR  - https://doi.org/10.2991/aisr.k.200424.051
DO  - 10.2991/aisr.k.200424.051
ID  - SAPUTRA2020
ER  -