A Chinese Word Clustering Method Using Latent Dirichlet Allocation and K-means
Authors
Lin Qiu, Jungang Xu
Corresponding Author
Lin Qiu
Available Online July 2013.
- DOI
- 10.2991/cse.2013.60
- Keywords
- word clustering; latent dirichlet allocation; k-means; word similarity
- Abstract
Word clustering is an active research topic in natural language processing. In this paper, the Latent Dirichlet Allocation (LDA) algorithm is used to extract topics from the nouns in a text, and the highest-probability noun of each topic is selected as an initial centroid for the k-means algorithm. Experimental results show that this method achieves better results than graph-based word clustering algorithms that use a web search engine.
- Copyright
- © 2013, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
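The seeding idea in the abstract (take the highest-probability noun of each LDA topic as a starting centroid for k-means) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a topic-word probability matrix is already available from some LDA run, and it assumes each noun is represented by its normalized column of topic probabilities; the abstract does not specify the word representation or distance measure. The names `topic_word`, `vocab`, `seed_words`, `word_vectors`, and `kmeans` are hypothetical, and the numbers are toy values, not results from the paper.

```python
import math

# Toy vocabulary of nouns (apple, banana, car, train).
vocab = ["苹果", "香蕉", "汽车", "火车"]

# Hypothetical LDA output: one row per topic, one column per noun.
# These probabilities are made up for illustration.
topic_word = [
    [0.50, 0.40, 0.05, 0.05],  # topic 0
    [0.05, 0.05, 0.40, 0.50],  # topic 1
]


def seed_words(topic_word, vocab):
    """Pick the highest-probability noun of each topic, skipping repeats."""
    seeds = []
    for row in topic_word:
        ranked = sorted(range(len(vocab)), key=lambda j: row[j], reverse=True)
        for j in ranked:
            if vocab[j] not in seeds:
                seeds.append(vocab[j])
                break
    return seeds


def word_vectors(topic_word, vocab):
    """Assumption: represent a noun by its normalized topic-probability column."""
    vecs = {}
    for j, word in enumerate(vocab):
        column = [row[j] for row in topic_word]
        total = sum(column)
        vecs[word] = [p / total for p in column]
    return vecs


def kmeans(vecs, centroids, iters=20):
    """Plain k-means with Euclidean distance, started from the given centroids."""
    centroids = [list(c) for c in centroids]
    assign = {}
    for _ in range(iters):
        new_assign = {
            word: min(range(len(centroids)),
                      key=lambda c: math.dist(vec, centroids[c]))
            for word, vec in vecs.items()
        }
        if new_assign == assign:  # converged
            break
        assign = new_assign
        for c in range(len(centroids)):
            members = [vecs[w] for w, a in assign.items() if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return assign


seeds = seed_words(topic_word, vocab)
vecs = word_vectors(topic_word, vocab)
assign = kmeans(vecs, [vecs[s] for s in seeds])
print(seeds)
print(assign)
```

Seeding from topic words fixes the number of clusters to the number of topics and replaces random initialization, which is the part of the method the abstract describes; everything else in the sketch is a placeholder choice.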
Cite this article
TY - CONF
AU - Lin Qiu
AU - Jungang Xu
PY - 2013/07
DA - 2013/07
TI - A Chinese Word Clustering Method Using Latent Dirichlet Allocation and K-means
BT - Proceedings of the 2nd International Conference on Advances in Computer Science and Engineering (CSE 2013)
PB - Atlantis Press
SP - 269
EP - 272
SN - 1951-6851
UR - https://doi.org/10.2991/cse.2013.60
DO - 10.2991/cse.2013.60
ID - Qiu2013/07
ER -