Research on Modern Chinese Multi-category Words Part of Speech Tagging Based on Hidden Markov Model
- DOI
- 10.2991/meic-14.2014.88
- Keywords
- Computer systems; Chinese information processing; Multi-category words; Part of speech tagging; Hidden Markov Model
- Abstract
In recent years, computer systems have been widely used for part of speech (POS) tagging of modern Chinese. POS tagging is a basic task in natural language processing, with applications in machine translation, natural language understanding, Chinese corpus construction, information retrieval, text classification, text proofreading and speech recognition, among others. Tagging multi-category words, that is, words that can take more than one part of speech, has always been a difficulty. Although the total number of multi-category words in modern Chinese is not high, they are used very widely. This paper proposes an algorithm for POS tagging of multi-category words. First, the text is segmented into words using the traditional method. On this basis, we then introduce a rule-based method for tagging multi-category words. Finally, we give a detailed description of the Hidden Markov Model (HMM) as used in POS tagging, together with a statistical tagging algorithm based on it.
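As a rough illustration of the HMM step mentioned in the abstract, the sketch below decodes the most probable tag sequence with the standard Viterbi algorithm for a first-order HMM. It is not the paper's implementation: the function name `viterbi`, the three-tag set (`r` pronoun, `v` verb, `n` noun), the pre-segmented input and all probabilities are hypothetical toy values chosen only to show how context can resolve a multi-category word such as 研究 (noun or verb).

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Most probable tag sequence for `words` under a first-order HMM.

    Probabilities missing from the tables are replaced by `floor`
    (an assumed smoothing constant) so that log() is always defined.
    """
    def lg(p):
        return math.log(max(p, floor))

    # best[i][t]: log-probability of the best path that ends in tag t at word i
    best = [{t: lg(start_p.get(t, 0.0)) + lg(emit_p[t].get(words[0], 0.0))
             for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lg(trans_p[p].get(t, 0.0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lg(emit_p[t].get(words[i], 0.0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model; every number below is invented for illustration.
tags = ["r", "v", "n"]
start_p = {"r": 0.6, "v": 0.1, "n": 0.3}
trans_p = {
    "r": {"r": 0.05, "v": 0.70, "n": 0.25},
    "v": {"r": 0.10, "v": 0.10, "n": 0.80},
    "n": {"r": 0.10, "v": 0.50, "n": 0.40},
}
emit_p = {
    "r": {"我们": 1.0},
    "v": {"研究": 0.6},
    "n": {"研究": 0.3, "语言": 0.5},  # 研究 can also be emitted as a noun
}

words = ["我们", "研究", "语言"]  # already segmented: "we / study / language"
tagged = viterbi(words, tags, start_p, trans_p, emit_p)
print(list(zip(words, tagged)))
```

With these toy numbers the multi-category word 研究 is tagged as a verb, because the pronoun-to-verb transition and the verb emission together outweigh the noun reading. In a real system the three tables would be estimated from a tagged corpus.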
- Copyright
- © 2014, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY  - CONF
AU  - Zhendong Song
PY  - 2014/11
DA  - 2014/11
TI  - Research on Modern Chinese Multi-category Words Part of Speech Tagging Based on Hidden Markov Model
BT  - Proceedings of the 2014 International Conference on Mechatronics, Electronic, Industrial and Control Engineering
PB  - Atlantis Press
SP  - 393
EP  - 397
SN  - 2352-5401
UR  - https://doi.org/10.2991/meic-14.2014.88
DO  - 10.2991/meic-14.2014.88
ID  - Song2014/11
ER  -