A Survey of Web Page Preprocessing Research
Authors
Qi Qi, Gui-Xian Xu
Corresponding Author
Qi Qi
Available Online December 2016.
- DOI
- 10.2991/icwcsn-16.2017.118
- Keywords
- Web page cleaning; data mining; Web mining; information retrieval.
- Abstract
Web pages collected by a crawler contain, alongside the required information, a large amount of noise such as advertisements and navigation bars. This noise is independent of the page's topic and should be removed with basic denoising methods, so it is worthwhile to summarize existing work on Web page denoising and study it further. First, we explain why page denoising is necessary, define page denoising, and summarize the methods of Web page denoising. Second, we consider improvements to Web page denoising algorithms. Finally, we discuss the problems of Web page denoising and directions for future research.
- Copyright
- © 2017, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Qi Qi
AU - Gui-Xian Xu
PY - 2016/12
DA - 2016/12
TI - A Survey of Web Page Preprocessing Research
BT - Proceedings of the 3rd International Conference on Wireless Communication and Sensor Networks (WCSN 2016)
PB - Atlantis Press
SP - 585
EP - 588
SN - 2352-538X
UR - https://doi.org/10.2991/icwcsn-16.2017.118
DO - 10.2991/icwcsn-16.2017.118
ID - Qi2016/12
ER -