A Similarity-Based Method for Entity Coreference Resolution in Big Data Environment
- DOI
- 10.2991/amitp-16.2016.22
- Keywords
- big data, entity coreference resolution, similarity, MapReduce.
- Abstract
Big data environments require the processing and analysis of large-scale data. However, the large number of duplicate records that refer to the same entity makes the acquired data difficult to analyze and process. Cluster analysis is one of the main approaches to entity coreference resolution, but it is time-consuming and does not scale to big data environments. This paper presents a similarity-based method for entity coreference resolution that introduces weights and similarity measures and runs on the Hadoop platform with the MapReduce framework, which processes data as key-value pairs and can be applied efficiently to entity coreference resolution. Experiments show that the proposed method greatly improves the speed and accuracy of entity coreference resolution and meets the demand for entity coreference resolution in big data environments.
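The abstract does not give the paper's actual weights, similarity measure, or key design, so the following is only a minimal sketch of the general idea (weighted attribute similarity computed over key-value groups), not the authors' implementation. The attribute names, the weights, the threshold, the first-letter grouping key, and the use of Python's `difflib.SequenceMatcher` are all assumptions made for illustration, and the map, shuffle, and reduce steps are simulated locally rather than run on Hadoop.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical attribute weights and threshold (assumed, not from the paper).
WEIGHTS = {"name": 0.6, "city": 0.4}
THRESHOLD = 0.85


def map_phase(records):
    """Emit (key, record) pairs; the first-letter key is an assumption."""
    for rec in records:
        yield rec["name"][:1].lower(), rec


def weighted_similarity(a, b):
    """Weighted sum of per-attribute string similarities, each in [0, 1]."""
    return sum(
        w * SequenceMatcher(None, a[attr].lower(), b[attr].lower()).ratio()
        for attr, w in WEIGHTS.items()
    )


def reduce_phase(group):
    """Compare records that share a key; keep pairs at or above the threshold."""
    return [
        (a["id"], b["id"])
        for a, b in combinations(group, 2)
        if weighted_similarity(a, b) >= THRESHOLD
    ]


def resolve(records):
    """Simulate the shuffle locally: group mapped pairs by key, then reduce."""
    groups = defaultdict(list)
    for key, rec in map_phase(records):
        groups[key].append(rec)
    pairs = []
    for group in groups.values():
        pairs.extend(reduce_phase(group))
    return pairs
```

Grouping by key before comparing means only records that share a key are compared pairwise, which is what lets the comparison work be spread across reducers. A real deployment would need a key that does not separate true duplicates.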
- Copyright
- © 2016, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Yushui Geng
AU - Peng Li
AU - Jing Zhao
PY - 2016/09
DA - 2016/09
TI - A Similarity-Based Method for Entity Coreference Resolution in Big Data Environment
BT - Proceedings of the 2016 4th International Conference on Advanced Materials and Information Technology Processing (AMITP 2016)
PB - Atlantis Press
SP - 110
EP - 116
SN - 2352-538X
UR - https://doi.org/10.2991/amitp-16.2016.22
DO - 10.2991/amitp-16.2016.22
ID - Geng2016/09
ER -