Research on Web Character Information Extraction Based on Semantic Similarity
- DOI
- 10.2991/ceie-16.2017.85
- Keywords
- Semantic Similarity; Character Information Extraction; Machine Learning
- Abstract
To address the loss of comprehensiveness that occurs when extracting information from large amounts of data, this paper proposes a character information extraction method based on a semantic similarity algorithm, with the aim of improving the comprehensiveness of character information extraction from massive network data. The algorithm is applied to a semantic tree to select synonyms of a word, and the character feature set extended through semantic similarity is then used for character information extraction. The results show that recall reaches 81.87% while the accuracy rate remains essentially unchanged. The method therefore clearly improves the comprehensiveness of character information extraction and can be applied to network data.
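The abstract does not give the similarity formula or the structure of the semantic tree, so the following is only a minimal sketch of the general idea, under stated assumptions: a toy hand-built tree (`PARENT`, hypothetical data), a Wu-Palmer-style depth-based similarity swapped in as a stand-in for the paper's measure, and a hypothetical threshold of 0.6 for deciding which words are close enough to extend the feature set.

```python
# Toy semantic tree stored as child -> parent.
# Hypothetical data for illustration; not taken from the paper.
PARENT = {
    "attribute": None,
    "occupation": "attribute",
    "job": "occupation",
    "profession": "occupation",
    "origin": "attribute",
    "birthplace": "origin",
    "hometown": "origin",
}


def path_to_root(word):
    """Return the nodes from `word` up to the root, inclusive."""
    path = []
    while word is not None:
        path.append(word)
        word = PARENT[word]
    return path


def depth(word):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(word))


def similarity(a, b):
    """Wu-Palmer-style score: 2 * depth(lcs) / (depth(a) + depth(b)).

    This is an assumed stand-in, not the formula used in the paper.
    """
    ancestors_a = set(path_to_root(a))
    # First node on b's path that is also an ancestor of a (or a itself):
    # the lowest common subsumer.
    lcs = next(node for node in path_to_root(b) if node in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def extend_features(seeds, threshold=0.6):
    """Add every tree word whose similarity to some seed meets the threshold."""
    extended = set(seeds)
    for seed in seeds:
        for candidate in PARENT:
            if similarity(seed, candidate) >= threshold:
                extended.add(candidate)
    return extended


if __name__ == "__main__":
    print(sorted(extend_features({"job"})))
```

With the seed feature `job`, the sketch adds the nearby tree words `occupation` and `profession` and leaves out distant ones such as `birthplace`. A larger feature set of this kind is what would be expected to raise recall, since an extractor can then match more surface forms of the same attribute.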
- Copyright
- © 2017, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Baocheng Wang
AU - Wei Huang
AU - Zhongren Li
AU - Ke Xiao
PY - 2016/10
DA - 2016/10
TI - Research on Web Character Information Extraction Based on Semantic Similarity
BT - Proceedings of the International Conference on Communication and Electronic Information Engineering (CEIE 2016)
PB - Atlantis Press
SP - 663
EP - 670
SN - 2352-5401
UR - https://doi.org/10.2991/ceie-16.2017.85
DO - 10.2991/ceie-16.2017.85
ID - Wang2016/10
ER -