Identification of Disease-Associated Combination of SNPs Using a Hybrid Algorithm with Multiple Encoding Approaches
- DOI
- 10.2991/ammsa-17.2017.85
- Keywords
- GWAS; SNPs; feature selection; UMDA; SVM
- Abstract
Background: An individual SNP often exhibits only a small effect, but combinations of SNPs are assumed to strongly influence the risk of disease. Selecting the optimal subset of SNPs that is most strongly associated with a disease is an NP-hard problem. Results: To achieve higher predictive power for disease status and higher computational efficiency, we propose a double-filter-wrapper (DFW) algorithm to identify the optimal subset of SNPs. Moreover, few studies have addressed the issue of SNP encoding. Based on the differences in statistical properties between cases and controls, we propose three encoding methods to generate the input for the DFW algorithm. Conclusion: We used five complex disease datasets to verify the effectiveness of our algorithm. The experimental results show that our method appears more promising than other current methods for identifying disease-associated SNPs. The results also indicate that the encoding methods proposed in this paper reflect the real situation considerably more accurately.
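The abstract does not describe the internals of the DFW algorithm, so the following is only a generic illustration of the wrapper idea suggested by the keywords (UMDA, feature selection). It is a minimal sketch of a univariate marginal distribution algorithm searching over binary SNP-selection masks, not the authors' method. The names `umda_select`, `toy_fitness`, and `INFORMATIVE`, and all parameter values, are assumptions made for illustration. In a real wrapper the fitness would typically be a classifier's cross-validated accuracy (for example an SVM); here a toy score stands in for it.

```python
import random

def umda_select(n_features, fitness, pop_size=60, n_keep=20,
                generations=40, seed=0):
    """Generic UMDA over binary masks; returns (best_mask, best_score)."""
    rng = random.Random(seed)
    probs = [0.5] * n_features  # marginal probability of selecting each SNP
    best_mask, best_score = None, float("-inf")
    for _ in range(generations):
        # Sample candidate subsets from the current marginal distribution.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        scored = sorted(((fitness(m), m) for m in population),
                        key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0]
        elite = [m for _, m in scored[:n_keep]]
        # Re-estimate each marginal from the elite; clamp so no bit freezes.
        probs = [min(0.95, max(0.05, sum(m[i] for m in elite) / n_keep))
                 for i in range(n_features)]
    return best_mask, best_score

# Hypothetical stand-in for classifier accuracy: reward three "informative"
# SNP indices and penalise subset size.
INFORMATIVE = {2, 5, 7}

def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)

best_mask, best_score = umda_select(12, toy_fitness)
selected = [i for i, bit in enumerate(best_mask) if bit]
print(selected, round(best_score, 2))
```

Because UMDA models each position independently, it is cheap to update but cannot represent interactions between SNPs directly; any interaction effects have to be captured by the fitness function that scores each subset.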
- Copyright
- © 2017, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Jing Zhao
AU - Bin Wei
PY - 2017/05
DA - 2017/05
TI - Identification of Disease-Associated Combination of SNPs Using a Hybrid Algorithm with Multiple Encoding Approaches
BT - Proceedings of the 2017 International Conference on Applied Mathematics, Modelling and Statistics Application (AMMSA 2017)
PB - Atlantis Press
SP - 378
EP - 382
SN - 1951-6851
UR - https://doi.org/10.2991/ammsa-17.2017.85
DO - 10.2991/ammsa-17.2017.85
ID - Zhao2017/05
ER -