The Research of Chinese Short-text Classification Based on Domain Keyword Set Extension and HowNet

Xiangdong Li; Fan Gao; Cong Ding

doi:10.2991/icca-16.2016.57

Authors

Xiangdong Li, Fan Gao, Cong Ding

Corresponding Author

Xiangdong Li

Available Online January 2016.

DOI: 10.2991/icca-16.2016.57
Keywords: Short-text classification, Keyword set, LDA, Feature extension, HowNet
Abstract: To extend the features of short texts and improve short-text classification performance, this paper extracts the high-frequency words and topic core words of each class in the training set as a domain keyword set, based on two feature granularities: keyword and latent topic. It then derives the topic probability distribution of each test text using an LDA model; when the probability of a topic exceeds a certain threshold, the keywords of that topic are added to the test text. Finally, the semantic similarity between the test text and the domain keyword set of each category is calculated using HowNet. Experimental results show that the proposed method can effectively improve short-text classification performance.
Copyright: © 2016, the Authors. Published by Atlantis Press.
Open Access: This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
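
The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `top_n` and `threshold` values, the toy similarity table standing in for HowNet, the averaging rule for text-to-set similarity, and the choice of the highest-scoring category as the prediction are all hypothetical. The topic distribution is assumed to come from an already trained LDA model and is passed in as a plain list.

```python
from collections import Counter

# Hypothetical word-pair similarities; a real system would query HowNet instead.
TOY_SIM = {frozenset(("team", "match")): 0.5}


def build_keyword_sets(train_docs, top_n=2):
    """Per class, keep the top_n most frequent words as its domain keyword set."""
    counts = {}
    for label, words in train_docs:
        counts.setdefault(label, Counter()).update(words)
    return {label: {w for w, _ in c.most_common(top_n)}
            for label, c in counts.items()}


def extend_text(words, topic_probs, topic_keywords, threshold=0.5):
    """Append the keywords of every topic whose probability exceeds the threshold."""
    extended = list(words)
    for topic, prob in enumerate(topic_probs):
        if prob > threshold:
            extended.extend(w for w in topic_keywords[topic] if w not in extended)
    return extended


def toy_similarity(a, b):
    """Stand-in for a HowNet word similarity: 1.0 if identical, else table lookup."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)


def text_to_set_similarity(words, keyword_set, sim):
    """Average, over the words of the text, the best similarity to any keyword."""
    if not words or not keyword_set:
        return 0.0
    return sum(max(sim(w, k) for k in keyword_set) for w in words) / len(words)


def classify(words, keyword_sets, sim):
    """Assumed decision rule: pick the category with the highest similarity."""
    return max(keyword_sets,
               key=lambda label: text_to_set_similarity(words, keyword_sets[label], sim))


train_docs = [
    ("sports", ["match", "goal", "team", "goal"]),
    ("finance", ["stock", "bank", "stock", "market"]),
]
keyword_sets = build_keyword_sets(train_docs)
# Topic distribution and per-topic keywords would come from a trained LDA model.
extended = extend_text(["win"], [0.8, 0.2], [["goal", "team"], ["stock"]])
predicted = classify(extended, keyword_sets, toy_similarity)
```

In this sketch the one-word test text gains the keywords of its dominant topic before similarity is computed, which is the point of the feature extension step: a short text that shares no words with any domain keyword set can still be matched through its latent topic.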

Volume Title: Proceedings of the 2016 International Conference on Intelligent Control and Computer Application
Series: Advances in Computer Science Research
Publication Date: January 2016
ISBN: 978-94-6252-154-4
ISSN: 2352-538X

Cite this article

TY  - CONF
AU  - Xiangdong Li
AU  - Fan Gao
AU  - Cong Ding
PY  - 2016/01
DA  - 2016/01
TI  - The Research of Chinese Short-text Classification Based on Domain Keyword Set Extension and HowNet
BT  - Proceedings of the 2016 International Conference on Intelligent Control and Computer Application
PB  - Atlantis Press
SP  - 244
EP  - 247
SN  - 2352-538X
UR  - https://doi.org/10.2991/icca-16.2016.57
DO  - 10.2991/icca-16.2016.57
ID  - Li2016/01
ER  -