Affinity Propagation Clustering Algorithm based on Spark Platform

Lijia Zhang; Lianglun Cheng

doi:10.2991/wartia-16.2016.107

Corresponding Author

Lijia Zhang

Available Online May 2016.

DOI: 10.2991/wartia-16.2016.107
Keywords: Affinity propagation, Resilient Distributed Datasets, Spark, Large scale dataset.
Abstract: With the explosive growth of data, processing large-scale, complex data has become a challenge. Many clustering algorithms have been proposed, among them the Affinity Propagation (AP) clustering algorithm, which takes the similarities between pairs of data points as its input. Compared with existing clustering algorithms, AP is fast and efficient on large datasets. As the scale of data keeps growing, however, the time efficiency of AP is no longer satisfactory. This paper therefore proposes an AP clustering algorithm based on the Spark platform (Spark-AP). First, the dataset is partitioned into several Resilient Distributed Datasets (RDDs) according to a partitioning strategy, and the exemplars of each RDD are selected. The exemplars are then merged and used as the input to a further round of AP clustering, which yields a set of high-quality exemplars after convergence. Experiments show that Spark-AP performs better in both processing scale and processing time.
Copyright: © 2016, the Authors. Published by Atlantis Press.
Open Access: This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
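
The two-stage procedure described in the abstract can be illustrated with a minimal local sketch in plain Python. It is not the authors' implementation and does not use Spark: the partitions are ordinary lists standing in for RDDs, the round-robin split is an assumption (the abstract does not name the partitioning strategy), and the function names, the damping value, the iteration count, and the median preference are all illustrative choices. The message updates follow the standard AP formulation.

```python
from statistics import median

def similarity(p, q):
    # Negative squared Euclidean distance, a common choice of AP similarity.
    return -sum((a - b) ** 2 for a, b in zip(p, q))

def affinity_propagation(points, damping=0.5, iterations=200):
    """Run standard AP on a small list of points; return exemplar indices."""
    n = len(points)
    if n < 2:
        return list(range(n))
    S = [[similarity(p, q) for q in points] for p in points]
    # Assumed preference: median of the off-diagonal similarities.
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        S[k][k] = pref
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # Responsibility: how well suited k is as exemplar for i,
        # relative to the best competing candidate.
        for i in range(n):
            for k in range(n):
                best = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best)
        # Availability: accumulated evidence that k is a good exemplar.
        for k in range(n):
            gains = [max(0.0, R[i][k]) for i in range(n)]
            total = R[k][k] + sum(gains) - gains[k]
            for i in range(n):
                if i == k:
                    new = total - R[k][k]
                else:
                    new = min(0.0, total - gains[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    score = [A[k][k] + R[k][k] for k in range(n)]
    exemplars = [k for k in range(n) if score[k] > 0]
    # Fall back to the single best candidate if nothing qualifies.
    return exemplars or [max(range(n), key=score.__getitem__)]

def two_stage_ap(points, num_partitions):
    """Local AP per partition, then AP over the merged exemplars."""
    # Stage 1: round-robin split (assumed strategy), AP on each partition.
    chunks = [points[i::num_partitions] for i in range(num_partitions)]
    merged = []
    for chunk in chunks:
        merged.extend(chunk[k] for k in affinity_propagation(chunk))
    # Stage 2: AP on the merged exemplars gives the final exemplars.
    final = [merged[k] for k in affinity_propagation(merged)]
    # Assign every point to its most similar final exemplar.
    labels = [max(range(len(final)), key=lambda c: similarity(p, final[c]))
              for p in points]
    return final, labels
```

Calling `two_stage_ap(points, 3)` on a list of coordinate tuples returns the final exemplars and, for each point, the index of its nearest exemplar. On a real cluster the per-partition step would presumably run in parallel, for example through `RDD.mapPartitions`, with only the much smaller exemplar set collected for the second stage.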

Volume Title: Proceedings of the 2016 2nd Workshop on Advanced Research and Technology in Industry Applications
Series: Advances in Engineering Research
Publication Date: May 2016
ISBN: 978-94-6252-195-7
ISSN: 2352-5401

Cite this article

TY  - CONF
AU  - Lijia Zhang
AU  - Lianglun Cheng
PY  - 2016/05
DA  - 2016/05
TI  - Affinity Propagation Clustering Algorithm based on Spark Platform
BT  - Proceedings of the 2016 2nd Workshop on Advanced Research and Technology in Industry Applications
PB  - Atlantis Press
SP  - 530
EP  - 533
SN  - 2352-5401
UR  - https://doi.org/10.2991/wartia-16.2016.107
DO  - 10.2991/wartia-16.2016.107
ID  - Zhang2016/05
ER  -
