A Feature Weight Algorithm for Text Classification Based on Class Information
- DOI
- 10.2991/iccia.2012.226
- Keywords
- text classification, feature weight, inverse class frequency, term frequency in class, document frequency in class
- Abstract
The TFIDF algorithm is used for feature weighting in text classification, but the classification results are not very good because the weights carry no class information. The known class information in the training set was used to improve the traditional TFIDF feature weight algorithm. Two notions were introduced: class distinction ability, expressed by inverse class frequency, and class description ability, expressed by term frequency in class and document frequency in class. A new feature weight algorithm based on class information, TF_IDT, was proposed. A Naïve Bayes classifier was used to test the algorithm. Precision, recall, and F1 measure increased significantly, with the macro F1 measure rising by 6.46%. The results show that using class information in feature weighting improves text classification. In addition, the proposed algorithm has lower computational complexity, which makes it more suitable for settings with limited computing capability.
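The abstract names three class-based quantities but does not give their formulas. As a minimal sketch only, the following Python shows one plausible way to compute them from a labeled training set. The function name `class_weights`, the smoothed form `log(1 + C / cf(t))` for inverse class frequency, and the product used to combine the three factors are all assumptions made for illustration; they are not taken from the paper and should not be read as the TF_IDT formula.

```python
import math
from collections import Counter, defaultdict


def class_weights(docs, labels):
    """Return {class: {term: weight}} from tokenized docs and their labels.

    Assumed for illustration (not the paper's formula):
        weight = tf_in_class * df_in_class * icf
        icf    = log(1 + n_classes / classes_containing_term)
    """
    term_count = defaultdict(Counter)  # class -> term -> occurrences
    doc_count = defaultdict(Counter)   # class -> term -> docs containing term
    docs_per_class = Counter(labels)   # class -> number of documents

    for tokens, label in zip(docs, labels):
        term_count[label].update(tokens)
        doc_count[label].update(set(tokens))

    n_classes = len(docs_per_class)
    class_freq = Counter()             # term -> number of classes containing it
    for label in term_count:
        class_freq.update(term_count[label].keys())

    weights = {}
    for label, counts in term_count.items():
        total_terms = sum(counts.values())
        weights[label] = {}
        for term, n in counts.items():
            tf_c = n / total_terms                              # term frequency in class
            df_c = doc_count[label][term] / docs_per_class[label]  # document frequency in class
            icf = math.log(1 + n_classes / class_freq[term])    # inverse class frequency
            weights[label][term] = tf_c * df_c * icf
    return weights


# Toy training set: two "sport" documents and one "politics" document.
docs = [["goal", "match"], ["goal", "team"], ["vote", "match"]]
labels = ["sport", "sport", "politics"]
w = class_weights(docs, labels)
```

In this toy set, "goal" occurs in every "sport" document and in no other class, so under the assumed formula it receives a higher weight for "sport" than "match", which also appears in "politics".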
- Copyright
- © 2013, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Yongfei Li
PY - 2014/05
DA - 2014/05
TI - A Feature Weight Algorithm for Text Classification Based on Class Information
BT - Proceedings of the 2012 2nd International Conference on Computer and Information Application (ICCIA 2012)
PB - Atlantis Press
SP - 930
EP - 932
SN - 1951-6851
UR - https://doi.org/10.2991/iccia.2012.226
DO - 10.2991/iccia.2012.226
ID - Li2014/05
ER -