Proceedings of the 2018 3rd International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2018)

Domain Topic and Hidden Deep Web Data Extracting

Authors
Liming Du, Abdulhamid Yahaya, Gui Li, Fengying Wang, Jie Dong
Corresponding Author
Liming Du
Available Online May 2018.
DOI
10.2991/amcce-18.2018.151
Keywords
Web Data; Data extraction; Data Mining
Abstract

This paper studies domain-based methods for extracting web data entities. Based on an analysis of real estate industry websites, a topic-oriented extraction model is proposed, together with a corresponding search strategy. In addition, for deep web information, a sorting-based classification extraction algorithm is designed for numerical data. Finally, an experimental example is given to verify the effectiveness of the algorithm.

Copyright
© 2018, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2018 3rd International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2018)
Series
Advances in Engineering Research
Publication Date
May 2018
ISBN
978-94-6252-508-5
ISSN
2352-5401

Cite this article

TY  - CONF
AU  - Liming Du
AU  - Abdulhamid Yahaya
AU  - Gui Li
AU  - Fengying Wang
AU  - Jie Dong
PY  - 2018/05
DA  - 2018/05
TI  - Domain Topic and Hidden Deep Web Data Extracting
BT  - Proceedings of the 2018 3rd International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2018)
PB  - Atlantis Press
SP  - 862
EP  - 866
SN  - 2352-5401
UR  - https://doi.org/10.2991/amcce-18.2018.151
DO  - 10.2991/amcce-18.2018.151
ID  - Du2018/05
ER  -