Improvements and Implementation of Hierarchical Clustering based on Hadoop
- DOI
- 10.2991/icmmita-15.2015.236
- Keywords
- Hierarchical Clustering; Hadoop; MapReduce
- Abstract
Traditional agglomerative hierarchical clustering requires a large number of iterations, which makes a parallel implementation on Hadoop inefficient. We propose an improved hierarchical clustering method: when the between-class distance is monotonically increasing, the clustering order can be changed without changing the final clustering result, so multiple classes can be aggregated in a single MapReduce operation. This reduces the number of iterations and improves computational efficiency. Experiments show that, compared with the traditional hierarchical clustering algorithm implemented on Hadoop, the improved algorithm greatly reduces both the number of iterations and the computation time.
- Copyright
- © 2015, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
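The idea in the abstract, merging several classes per pass without changing the final hierarchy, can be illustrated with a small single-machine sketch. The abstract does not state the exact merge rule, so the following is an assumption rather than the authors' algorithm: it uses complete linkage (whose merge distances are monotonically non-decreasing) on 1-D points with distinct distances, and in each round merges every pair of classes that are each other's nearest neighbor. Each round stands in for one MapReduce job; no Hadoop API is used, and the names `one_merge_per_round` and `batched_merges` are hypothetical.

```python
from itertools import combinations


def complete_linkage(a, b, points):
    """Between-class distance: largest point-to-point gap (1-D points)."""
    return max(abs(points[i] - points[j]) for i in a for j in b)


def nearest(c, clusters, points):
    """Return the cluster closest to c, excluding c itself."""
    return min((o for o in clusters if o != c),
               key=lambda o: complete_linkage(c, o, points))


def one_merge_per_round(points):
    """Baseline: merge only the globally closest pair in each round."""
    clusters = [frozenset([i]) for i in range(len(points))]
    merges, rounds = set(), 0
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda p: complete_linkage(p[0], p[1], points))
        merges.add(a | b)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        rounds += 1
    return merges, rounds


def batched_merges(points):
    """Assumed variant: merge all mutual-nearest-neighbor pairs per round."""
    clusters = [frozenset([i]) for i in range(len(points))]
    merges, rounds = set(), 0
    while len(clusters) > 1:
        nn = {c: nearest(c, clusters, points) for c in clusters}
        # A pair qualifies when each cluster is the other's nearest neighbor.
        pairs = {frozenset([c, nn[c]]) for c in clusters if nn[nn[c]] == c}
        merged, used = [], set()
        for pair in pairs:
            a, b = tuple(pair)
            merged.append(a | b)
            used |= {a, b}
        merges.update(merged)
        clusters = [c for c in clusters if c not in used] + merged
        rounds += 1
    return merges, rounds


if __name__ == "__main__":
    pts = [0.0, 1.0, 5.0, 7.0, 20.0, 23.5]
    base_merges, base_rounds = one_merge_per_round(pts)
    fast_merges, fast_rounds = batched_merges(pts)
    print(base_merges == fast_merges, base_rounds, fast_rounds)
```

On this six-point example both functions produce the same set of merged classes, but the baseline needs 5 rounds while the batched variant needs 3. The sketch assumes no tied distances; with ties, a round might contain no mutual-nearest-neighbor pair and a tie-breaking rule would be needed.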
Cite this article
TY - CONF
AU - Jun Zhang
AU - Chunxiao Fan
AU - Yuexin Wu
AU - Ao Xiao
PY - 2015/11
DA - 2015/11
TI - Improvements and Implementation of Hierarchical Clustering based on Hadoop
BT - Proceedings of the 2015 3rd International Conference on Machinery, Materials and Information Technology Applications
PB - Atlantis Press
SP - 1279
EP - 1284
SN - 2352-538X
UR - https://doi.org/10.2991/icmmita-15.2015.236
DO - 10.2991/icmmita-15.2015.236
ID - Zhang2015/11
ER -