Cloud Computing K-Means Text Clustering Filtering Algorithm based on Hadoop

Suyu Huang

doi:10.2991/icmmita-16.2016.278


Authors

Suyu Huang

Corresponding Author

Suyu Huang

Available Online January 2017.

DOI: 10.2991/icmmita-16.2016.278
Keywords: clustering; k-means; text; cloud computing; big data; filtering
Abstract: Partitioning and hierarchical methods are the most widely used clustering techniques. Because k-means is sensitive to the initial cluster centers and tends to converge to a local optimum, this paper presents an improved clustering algorithm based on particle swarm optimization. The algorithm determines the number of clusters and the initial cluster centers dynamically by combining the method of reference [1] with that of reference [2], and it optimizes the normalization of the sample set, the weight adjustment of the particle swarm, and the computation of the dissimilarity matrix and the colony fitness variance. The initial cluster centers are chosen using density and the max-min distance, so as to eliminate the sensitivity of k-means to initial values and its tendency toward local optima. The dimension attributes of the sample set are normalized and the colony fitness variance is introduced to obtain a further optimized hybrid algorithm. According to the test results, the algorithm achieves higher accuracy and stronger convergence.
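The seeding idea named in the abstract (normalize the sample set, then choose initial centers by density and max-min distance) can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `min_max_normalize` and `max_min_seeds` are hypothetical, the number of clusters `k` is fixed by the caller rather than determined dynamically, the density criterion is approximated by "smallest total distance to all other points", and the particle swarm and Hadoop parts are omitted.

```python
import math


def min_max_normalize(samples):
    """Scale every dimension to [0, 1]; a constant dimension maps to 0.0."""
    dims = list(zip(*samples))
    lows = [min(d) for d in dims]
    highs = [max(d) for d in dims]
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(row, lows, highs)]
        for row in samples
    ]


def max_min_seeds(points, k):
    """Pick k initial centers with the max-min distance rule.

    Assumption: the first seed is the point with the smallest total
    distance to all other points, used here as a simple density proxy.
    """
    first = min(points, key=lambda p: sum(math.dist(p, q) for q in points))
    seeds = [first]
    while len(seeds) < k:
        # Next seed: the point whose nearest existing seed is farthest away.
        nxt = max(points, key=lambda p: min(math.dist(p, s) for s in seeds))
        seeds.append(nxt)
    return seeds
```

Calling `max_min_seeds(min_max_normalize(data), k)` returns `k` points from `data` (in normalized coordinates) that could serve as starting centers for an ordinary k-means run. Spreading the seeds apart in this way is one common remedy for the initial-value sensitivity the abstract describes.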
Copyright: © 2017, the Authors. Published by Atlantis Press.
Open Access: This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).


Volume Title: Proceedings of the 2016 4th International Conference on Machinery, Materials and Information Technology Applications
Series: Advances in Computer Science Research
Publication Date: January 2017
ISBN: 978-94-6252-285-5
ISSN: 2352-538X

Cite this article


TY  - CONF
AU  - Suyu Huang
PY  - 2017/01
DA  - 2017/01
TI  - Cloud Computing K-Means Text Clustering Filtering Algorithm based on Hadoop
BT  - Proceedings of the 2016 4th International Conference on Machinery, Materials and Information Technology Applications
PB  - Atlantis Press
SP  - 1209
EP  - 1214
SN  - 2352-538X
UR  - https://doi.org/10.2991/icmmita-16.2016.278
DO  - 10.2991/icmmita-16.2016.278
ID  - Huang2017/01
ER  -
