Efficient DNA Sequences Storage Scheme based on HBase
- DOI
- 10.2991/mecae-18.2018.122
- Keywords
- HBase, DNA sequences, DNA division, file index
- Abstract
Biological sequence data are large in volume, grow quickly, and contain highly repetitive sequences. To address these characteristics, a set of DNA sequence storage schemes based on HBase is proposed and implemented, drawing on the theory and technology of the HBase distributed database. A pre-splitting strategy and a Rowkey optimization based on a DNA classification code are designed, which address load balancing and server hot spots. Efficient data access is achieved by constructing a file index that stands in for the sequences themselves. Experiments demonstrate that the DNA sequence storage system built on these schemes has good storage capacity and scalability.
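As a rough illustration of how a classification-code Rowkey prefix and pre-split boundaries can spread rows across regions, here is a minimal sketch in plain Python. The abstract does not specify the paper's actual classification code, key layout, or region count, so the two-digit prefix, the count of 16 codes, and every function name below are hypothetical assumptions, not the authors' design.

```python
import bisect
from typing import List

# Assumption: 16 distinct classification codes (0..15); the paper's real
# coding scheme is not given in the abstract.
NUM_CODES = 16


def make_rowkey(class_code: int, seq_id: str) -> str:
    """Build a rowkey by prefixing the sequence ID with a zero-padded code."""
    return f"{class_code:02d}_{seq_id}"


def split_points(num_regions: int, num_codes: int = NUM_CODES) -> List[str]:
    """Return num_regions - 1 boundary keys that divide the code range evenly."""
    return [f"{(i * num_codes) // num_regions:02d}" for i in range(1, num_regions)]


def region_for(rowkey: str, splits: List[str]) -> int:
    """Return the index of the region whose key range contains the rowkey.

    Region start keys are inclusive and end keys exclusive, which is why
    bisect_right is used: a rowkey equal to a boundary belongs to the
    region that starts at that boundary.
    """
    return bisect.bisect_right(splits, rowkey)


splits = split_points(4)
for code, seq_id in [(0, "seqA"), (5, "seqB"), (9, "seqC"), (15, "seqD")]:
    key = make_rowkey(code, seq_id)
    print(key, "-> region", region_for(key, splits))
```

Under these assumptions, rows with different code prefixes land in different regions from the first write, rather than all hitting one region until it splits. In an actual deployment the boundary keys would be supplied when the table is created, for example through the `SPLITS` option of the HBase shell `create` command.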
- Copyright
- © 2018, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Shaoxiong Wen
PY - 2018/03
DA - 2018/03
TI - Efficient DNA Sequences Storage Scheme based on HBase
BT - Proceedings of the 2018 International Conference on Mechanical, Electronic, Control and Automation Engineering (MECAE 2018)
PB - Atlantis Press
SP - 334
EP - 337
SN - 2352-5401
UR - https://doi.org/10.2991/mecae-18.2018.122
DO - 10.2991/mecae-18.2018.122
ID - Wen2018/03
ER -