Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)

Effect of Distance Metrics in Determining K-Value in K-Means Clustering Using Elbow and Silhouette Method

Authors
Danny Matthew SAPUTRA, Daniel SAPUTRA, Liniyanti D. OSWARI
Corresponding Author
Danny Matthew SAPUTRA
Available Online 6 May 2020.
DOI
10.2991/aisr.k.200424.051
Keywords
clustering, Partitional-based clustering, K-means, elbow and silhouette
Abstract

Clustering is one of the main tasks in data mining: it groups similar data together. Common approaches include partitional, hierarchical, and density-based clustering. Partitional clustering divides the data into non-overlapping subsets. One of the most popular partitional clustering algorithms is K-means, which assigns each data point to one of K clusters based on its distance to the cluster centroid. Because the approach is partitional, the value of K must be determined before K-means is run. Determining K is a significant problem because there is no universal way to find it. Two popular ways to determine K are the elbow method and the silhouette method, both of which are graph-based. Before using either method, however, another factor must be decided: the distance metric. This paper shows the effect of three distance metrics (Manhattan, Euclidean, and Minkowski) on finding the value of K using the elbow and silhouette methods. Based on this study, the choice of distance metric has little impact on determining the value of K in K-means using the elbow and silhouette methods. The Manhattan distance produces the most variation in the elbow and silhouette graphs. The elbow method is difficult to use, and sometimes its graph does not allow the value of K to be determined.
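
The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the function names (minkowski, init_centroids, kmeans, inertia, silhouette, scan_k, make_blobs), the farthest-first seeding, and the choice p = 3 for the Minkowski case are all assumptions made for illustration. Cluster assignment uses the chosen metric while centroids are still updated as means, which is also an assumption, since the abstract does not say how centroids are computed under non-Euclidean metrics.

```python
import random


def minkowski(a, b, p):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def init_centroids(points, k, p):
    """Deterministic farthest-first seeding (illustrative choice, not from the paper)."""
    chosen = [points[0]]
    while len(chosen) < k:
        chosen.append(max(points, key=lambda pt: min(minkowski(pt, c, p) for c in chosen)))
    return chosen


def kmeans(points, k, p=2, iters=100):
    """Lloyd-style K-means: assign by Minkowski-p distance, update centroids as means."""
    centroids = init_centroids(points, k, p)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: minkowski(pt, centroids[c], p))
                      for pt in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def inertia(points, labels, centroids, p=2):
    """Sum of squared point-to-centroid distances: the y-axis of an elbow graph."""
    return sum(minkowski(pt, centroids[lab], p) ** 2 for pt, lab in zip(points, labels))


def silhouette(points, labels, p=2):
    """Mean silhouette coefficient over all points."""
    clusters = {}
    for pt, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(pt)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for pt, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(minkowski(pt, q, p) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(minkowski(pt, q, p) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def make_blobs(seed=1):
    """Toy data: three well-separated 2-D blobs of 20 points each."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
            for cx, cy in centers for _ in range(20)]


def scan_k(points, p, ks=range(2, 7)):
    """Return {k: (inertia, silhouette)} for one distance metric."""
    out = {}
    for k in ks:
        labels, cents = kmeans(points, k, p=p)
        out[k] = (inertia(points, labels, cents, p), silhouette(points, labels, p))
    return out


if __name__ == "__main__":
    data = make_blobs()
    for name, p in [("Manhattan", 1), ("Euclidean", 2), ("Minkowski p=3", 3)]:
        res = scan_k(data, p)
        best_k = max(res, key=lambda k: res[k][1])
        print(name, "best K by silhouette:", best_k)
        for k, (wcss, sil) in res.items():
            print("  K=%d  inertia=%.2f  silhouette=%.3f" % (k, wcss, sil))
```

Reading the printed table mirrors the two methods: the silhouette method picks the K with the highest mean silhouette, while the elbow method looks for the K after which inertia stops dropping sharply, a judgment that is made by eye on the graph and can be ambiguous.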

Copyright
© 2020, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)
Series
Advances in Intelligent Systems Research
Publication Date
6 May 2020
ISBN
978-94-6252-963-2
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Danny Matthew SAPUTRA
AU  - Daniel SAPUTRA
AU  - Liniyanti D. OSWARI
PY  - 2020
DA  - 2020/05/06
TI  - Effect of Distance Metrics in Determining K-Value in K-Means Clustering Using Elbow and Silhouette Method
BT  - Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)
PB  - Atlantis Press
SP  - 341
EP  - 346
SN  - 1951-6851
UR  - https://doi.org/10.2991/aisr.k.200424.051
DO  - 10.2991/aisr.k.200424.051
ID  - SAPUTRA2020
ER  -