A Link Structure Based Website Topic Hierarchy Extracting Approach
- DOI
- 10.2991/jcis.2008.98
- Keywords
- Link Structure; Website Topic Hierarchy; Weighted Directed Graph
- Abstract
Visualizing the hierarchy of a website helps users navigate it and helps search engines present results efficiently. In this paper, the link structure is first modeled as a weighted directed graph, with webpages as nodes and hyperlinks as directed edges. Each edge weight is determined by the semantic relevance between the linked webpages, computed from multiple website features including directory paths, page contents, and anchor texts. A single-source shortest path algorithm is then applied to extract the topic hierarchy. An experiment on the real web shows that the proposed method achieves an average precision gain of 11.67% over the baseline method.
- Copyright
- © 2008, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
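The pipeline summarized in the abstract (weighted directed link graph, then single-source shortest paths from the homepage) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the weight formula, so the sketch assumes each link already carries a relevance score in [0, 1] and uses `1 - relevance` as the edge weight; the function name `extract_hierarchy`, the `links` dictionary, and the example URLs are all hypothetical.

```python
import heapq


def extract_hierarchy(links, root):
    """Return a {child: parent} map built from shortest paths out of `root`.

    `links` maps (source_page, target_page) to a relevance score in [0, 1].
    Assumption (not from the paper): edge weight = 1 - relevance, so more
    relevant links are "shorter" and are preferred as parent-child edges.
    """
    graph = {}
    for (src, dst), relevance in links.items():
        graph.setdefault(src, []).append((dst, 1.0 - relevance))

    dist = {root: 0.0}
    parent = {}
    heap = [(0.0, root)]
    while heap:
        d, page = heapq.heappop(heap)
        if d > dist.get(page, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for nxt, weight in graph.get(page, []):
            candidate = d + weight
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                parent[nxt] = page  # best known parent in the hierarchy
                heapq.heappush(heap, (candidate, nxt))
    return parent


# Hypothetical site: relevance scores are made up for illustration.
links = {
    ("/", "/products"): 0.9,
    ("/", "/news"): 0.7,
    ("/", "/products/phone"): 0.2,          # weak direct link from the homepage
    ("/products", "/products/phone"): 0.9,  # strong topical link
    ("/news", "/products/phone"): 0.3,
    ("/products/phone", "/"): 0.5,          # back link, never becomes a tree edge
}
hierarchy = extract_hierarchy(links, "/")
for child, par in sorted(hierarchy.items()):
    print(f"{par} -> {child}")
```

In this toy site, `/products/phone` is linked directly from the homepage, yet the path through `/products` is shorter under the assumed weighting, so the page is placed under `/products` in the extracted tree. The parent pointers produced by Dijkstra's algorithm form the hierarchy, and pages unreachable from the root are simply absent from the result.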
Cite this article
TY - CONF
AU - Zhao Xu
AU - Qingcai Chen
AU - Hongzhi Guo
PY - 2008/12
DA - 2008/12
TI - A Link Structure Based Website Topic Hierarchy Extracting Approach
BT - Proceedings of the 11th Joint Conference on Information Sciences (JCIS 2008)
PB - Atlantis Press
SP - 584
EP - 589
SN - 1951-6851
UR - https://doi.org/10.2991/jcis.2008.98
DO - 10.2991/jcis.2008.98
ID - Xu2008/12
ER -