Research on the Small Files Problem of Hadoop
Authors
Xiaojun Liu, Chong Peng, Zhichao Yu
Corresponding Author
Xiaojun Liu
Available Online January 2015.
- DOI
- 10.2991/emcs-15.2015.9
- Keywords
- Hadoop Distributed File System (HDFS); Small Files Problem; Hadoop Archives; Sequence files; RDBMS
- Abstract
Although Hadoop is widely used, its full potential has not yet been realized because of several issues, the small files problem being one of them. First, the paper analyzes the causes of the small files problem in Hadoop. It then introduces the current approaches to solving the problem, including Hadoop's own mechanisms and other application-specific solutions, and analyzes the advantages and disadvantages of each option. Finally, the paper presents two research ideas: one is to combine an RDBMS with Hadoop; the other is to have the “Datanode” cache some metadata of the small files.
- Copyright
- © 2015, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Xiaojun Liu
AU - Chong Peng
AU - Zhichao Yu
PY - 2015/01
DA - 2015/01
TI - Research on the Small Files Problem of Hadoop
BT - Proceedings of the International Conference on Education, Management, Commerce and Society
PB - Atlantis Press
SP - 39
EP - 43
SN - 2352-5398
UR - https://doi.org/10.2991/emcs-15.2015.9
DO - 10.2991/emcs-15.2015.9
ID - Liu2015/01
ER -