Research on Document Content Classification on Mathematical Regression Model
Authors
Hua Long, Baoan Li
Corresponding Author
Hua Long
Available Online October 2015.
- DOI
- 10.2991/icmii-15.2015.120
- Keywords
- Document Classification; SVM (Support Vector Machine); CHI (Chi-square Statistic); Mathematical Regression Model
- Abstract
To improve document classification, this study proposes a classification algorithm based on a mathematical regression model, freeing Chinese document classification from its dependence on the traditional dictionary method. The method extracts high-frequency keywords and establishes an appropriate matrix model, reducing a high-dimensional document representation to a low-dimensional one, and then uses a mathematical regression model, trained on a corpus, to produce a comprehensive feature weighting function. The study thus explores an approach that avoids the curse of dimensionality affecting traditional methods.
- Copyright
- © 2015, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
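The pipeline outlined in the abstract (keyword extraction, a low-dimensional matrix, then a regression model trained on a corpus) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the function names are hypothetical, keywords are ranked by the CHI statistic named in the keyword list, plain logistic regression stands in for the paper's unspecified regression model, and the four-document English corpus is toy data that is already tokenized.

```python
import math
from collections import Counter


def chi_square(docs, labels, term, target):
    """CHI statistic between a term and a class, from a 2x2 contingency table."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        present = term in tokens
        in_class = label == target
        if present and in_class:
            a += 1          # term present, document in class
        elif present:
            b += 1          # term present, document not in class
        elif in_class:
            c += 1          # term absent, document in class
        else:
            d += 1          # term absent, document not in class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def select_keywords(docs, labels, target, k):
    """Keep the k terms with the highest CHI score (ties broken alphabetically)."""
    vocab = {t for tokens in docs for t in tokens}
    ranked = sorted(vocab, key=lambda t: (-chi_square(docs, labels, t, target), t))
    return ranked[:k]


def to_matrix(docs, keywords):
    """Map each document to a short vector of keyword counts."""
    rows = []
    for tokens in docs:
        counts = Counter(tokens)
        rows.append([counts[t] for t in keywords])
    return rows


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Logistic regression by stochastic gradient ascent; w[0] is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for row, target in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            err = target - 1.0 / (1.0 + math.exp(-z))
            w[0] += lr * err
            for j, xj in enumerate(row):
                w[j + 1] += lr * err * xj
    return w


def predict(w, row):
    """Return 1 if the weighted keyword score is positive, else 0."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
    return 1 if z > 0 else 0


# Toy corpus (assumed data): label 1 = finance, label 0 = sports.
docs = [
    ["stock", "market", "price", "rise"],
    ["market", "bank", "price"],
    ["team", "match", "goal"],
    ["goal", "team", "coach", "win"],
]
labels = [1, 1, 0, 0]

keywords = select_keywords(docs, labels, target=1, k=2)
X = to_matrix(docs, keywords)   # 4 documents x 2 keyword features
w = fit_logistic(X, labels)     # learned feature weights
print(keywords, [predict(w, row) for row in X])
```

In this sketch the learned vector `w` plays the role of the feature weighting function, and the dimensionality reduction comes purely from keeping `k` keywords instead of the full vocabulary.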
Cite this article
TY - CONF
AU - Hua Long
AU - Baoan Li
PY - 2015/10
DA - 2015/10
TI - Research on Document Content Classification on Mathematical Regression Model
BT - Proceedings of the 3rd International Conference on Mechatronics and Industrial Informatics
PB - Atlantis Press
SP - 695
EP - 698
SN - 2352-538X
UR - https://doi.org/10.2991/icmii-15.2015.120
DO - 10.2991/icmii-15.2015.120
ID - Long2015/10
ER -