Text Classification Model Based on Document Matrix Convolutional Neural Networks
- DOI
- 10.2991/caai-17.2017.115
- Keywords
- natural language processing; convolutional neural networks; document matrix; text classification; word embedding
- Abstract
This paper proposes a new text classification method named document matrix convolutional neural networks (DM-CNN). Conventional approaches convert texts into 1-dimensional vectors and treat each word as a pixel. In contrast, this model builds on n-dimensional word embeddings obtained in advance, treats each entry of a word embedding as a pixel, and converts the text into a 2-dimensional document matrix (DM) using the method proposed in this paper. The DM preserves the word order of the original text, so the DM-CNN model can process the text as if it were an image. The model largely preserves the information content of the more abstract words, as well as the structural information at all levels of the original text. In the experiments, the DM-CNN text classification model is compared with classifiers based on classical machine learning algorithms, and the results show the feasibility and superiority of DM-CNN.
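The abstract does not spell out how the document matrix is built, so the following is only a minimal sketch of one plausible reading: each word's embedding becomes one row, rows follow the original word order, and the matrix is padded or truncated to a fixed height. The function name `document_matrix`, the parameters `max_len` and `dim`, the zero-vector handling of unknown words, and the toy embeddings are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List


def document_matrix(
    tokens: List[str],
    embeddings: Dict[str, List[float]],
    max_len: int,
    dim: int,
) -> List[List[float]]:
    """Stack word embeddings row by row, keeping the original word order.

    Assumption: unknown words and padding rows are all-zero vectors.
    """
    zero = [0.0] * dim
    # One row per word, truncated to max_len words.
    rows = [list(embeddings.get(tok, zero)) for tok in tokens[:max_len]]
    # Pad with zero rows so every document has the same height.
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows


# Hypothetical 3-dimensional embeddings, for illustration only.
toy_embeddings = {
    "text": [0.1, 0.2, 0.3],
    "as": [0.4, 0.5, 0.6],
    "image": [0.7, 0.8, 0.9],
}

dm = document_matrix(["text", "as", "image"], toy_embeddings, max_len=4, dim=3)
for row in dm:
    print(row)
```

Under these assumptions, every entry of `dm` plays the role of one pixel, and the resulting `max_len` by `dim` grid could be passed to a 2-dimensional convolutional layer in the same way as a single-channel image.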
- Copyright
- © 2017, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Xuemiao Zhang
AU - Weizhong Qian
AU - Zhaoyi Liu
AU - Xin He
PY - 2017/06
DA - 2017/06
TI - Text Classification Model Based on Document Matrix Convolutional Neural Networks
BT - Proceedings of the 2017 2nd International Conference on Control, Automation and Artificial Intelligence (CAAI 2017)
PB - Atlantis Press
SP - 512
EP - 517
SN - 1951-6851
UR - https://doi.org/10.2991/caai-17.2017.115
DO - 10.2991/caai-17.2017.115
ID - Zhang2017/06
ER -