Character information extraction based on CRFsuite
- DOI
- 10.2991/aest-16.2016.19
- Keywords
- CRFsuite; information extraction; machine learning.
- Abstract
This paper applies Conditional Random Fields (CRFs), a discriminative undirected graphical model, to character information extraction and proposes an automated extraction method based on CRFsuite. By learning from a known domain, the method extracts the feature leading words, position and means from character information on the Internet to build a character parameter. Using CRFsuite as the model, the method applies it to data from the Internet, matches character information, and builds a structured character information database. The proposed method demonstrates the feasibility of automatically extracting character information from massive Internet data and provides an effective way to facilitate tracking and looking up character information.
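As a rough illustration of the feature-building step the abstract describes, the sketch below turns a labeled token sequence into CRFsuite's tab-separated training text (label first, then attributes, with a blank line ending each sequence). It is not the authors' implementation: the sample tokens, the label names (`B-NAME`, `I-NAME`), the `LEADING_WORDS` set, the attribute names, and the two-token window are all assumptions made for the example.

```python
# Hypothetical leading words that tend to precede a field value (assumption).
LEADING_WORDS = {"name", "born"}

def token_features(tokens, i, leading_words, window=2):
    """Attributes for token i: the word, its position, and a leading-word cue."""
    feats = ["w=" + tokens[i].lower(), "pos=" + str(i)]
    # Cue fires when one of the preceding `window` tokens is a leading word.
    context = [t.lower() for t in tokens[max(0, i - window):i]]
    if any(t in leading_words for t in context):
        feats.append("near_leading_word")
    if i == 0:
        feats.append("BOS")  # beginning of sequence
    if i == len(tokens) - 1:
        feats.append("EOS")  # end of sequence
    return feats

def escape(attr):
    """Escape the characters CRFsuite's text format reserves: '\\' and ':'."""
    return attr.replace("\\", "\\\\").replace(":", "\\:")

def to_crfsuite_lines(tokens, labels, leading_words):
    """One line per token: label, then tab-separated attributes."""
    lines = []
    for i, label in enumerate(labels):
        feats = [escape(f) for f in token_features(tokens, i, leading_words)]
        lines.append("\t".join([label] + feats))
    lines.append("")  # a blank line terminates the sequence
    return lines

# Made-up sample sequence, for illustration only.
tokens = ["Name", ":", "Li", "Ming"]
labels = ["O", "O", "B-NAME", "I-NAME"]
lines = to_crfsuite_lines(tokens, labels, LEADING_WORDS)
print("\n".join(lines))
```

Text in this shape could then be written to a file and passed to the CRFsuite command-line tool (for example `crfsuite learn -m <model> <file>` for training and `crfsuite tag -m <model> <file>` for labeling new data); which attributes and labels the paper actually uses is not stated in the abstract.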
- Copyright
- © 2016, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Jingzhong Wang
AU - Zhongren Li
AU - Wei Huang
AU - Ke Xiao
PY - 2016/11
DA - 2016/11
TI - Character information extraction based on CRFsuite
BT - Proceedings of the 2016 International Conference on Advanced Electronic Science and Technology (AEST 2016)
PB - Atlantis Press
SP - 147
EP - 154
SN - 1951-6851
UR - https://doi.org/10.2991/aest-16.2016.19
DO - 10.2991/aest-16.2016.19
ID - Wang2016/11
ER -