Design and Implementation of Digital Library Retrieval System Based on Hadoop
- DOI
- 10.2991/icsshe-17.2017.75
- Keywords
- Hadoop, library retrieval, distributed computing, Lucene
- Abstract
To support information retrieval, the system is designed on Hadoop, which serves as our experimental platform. It adopts the HDFS distributed storage system to improve reliability, fault tolerance and scalability. To reduce retrieval latency, the system uses the MapReduce distributed computing framework, in which the Map function distributes the data processing task across multiple nodes and the Reduce function aggregates the results from each node onto a single node. To achieve high-performance retrieval, the full-text retrieval framework Lucene has also been adopted. Lucene builds a unified index of information resources and sorts the retrieved resources by relevance to ensure accurate retrieval. Moreover, to improve the user experience, the system provides friendly query and display interfaces through JSP-based web design. When the amount of data is small, a distributed multi-node run takes longer than a single-node run. The experimental results show that the system is able to handle massive data and provide real-time, accurate results to help users make quick decisions.
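The Map and Reduce roles described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation and does not use the Hadoop or Lucene APIs: the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_index`, `search`) and the sample documents are hypothetical, and ranking by raw term frequency is only a stand-in for Lucene's relevance scoring.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit one (term, doc_id) pair per word, as a Map task would for its input split."""
    return [(term.lower(), doc_id) for term in text.split()]


def shuffle(pairs):
    """Group emitted pairs by term, imitating the shuffle step between Map and Reduce."""
    grouped = defaultdict(list)
    for term, doc_id in pairs:
        grouped[term].append(doc_id)
    return grouped


def reduce_phase(term, doc_ids):
    """Aggregate all pairs for one term into a posting list of (doc_id, frequency)."""
    counts = defaultdict(int)
    for doc_id in doc_ids:
        counts[doc_id] += 1
    # Assumption: rank by descending term frequency, then by document id for ties.
    postings = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return term, postings


def build_index(docs):
    """Run map, shuffle, and reduce in sequence to build an inverted index."""
    pairs = []
    for doc_id, text in docs.items():
        pairs.extend(map_phase(doc_id, text))
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())


def search(index, term):
    """Return matching document ids in ranked order; empty list if the term is absent."""
    return [doc_id for doc_id, _ in index.get(term.lower(), [])]


# Hypothetical sample catalogue, for illustration only.
docs = {
    "book1": "Hadoop distributed storage",
    "book2": "Lucene index Lucene search",
    "book3": "Hadoop and Lucene",
}
index = build_index(docs)
```

In this sketch, `search(index, "lucene")` ranks `book2` ahead of `book3` because the term occurs twice in `book2`. On a real cluster the map calls would run in parallel on separate nodes; for a catalogue this small, the coordination overhead would outweigh the gain, which matches the abstract's remark about small data volumes.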
- Copyright
- © 2017, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Daying Wang
PY - 2017/09
DA - 2017/09
TI - Design and Implementation of Digital Library Retrieval System Based on Hadoop
BT - Proceedings of the 2017 3rd International Conference on Social Science and Higher Education
PB - Atlantis Press
SP - 299
EP - 302
SN - 2352-5398
UR - https://doi.org/10.2991/icsshe-17.2017.75
DO - 10.2991/icsshe-17.2017.75
ID - Wang2017/09
ER -