Aadhaar Data Analysis Comparison in MapReduce, Hive and Spark
- DOI
- 10.2991/ahis.k.210913.036
- Keywords
- Aadhaar, Big data, MapReduce, Hadoop, Hive, Apache Spark
- Abstract
Aadhaar, a 12-digit unique identification number issued to every Indian, carries demographic and biometric information and is mandatory for various purposes such as direct benefit transfer and healthcare. Aadhaar details must be stored for approximately 1.3 billion Indians, which makes this a big data problem. In this paper, the proposed hybrid model analyses the Aadhaar dataset with respect to different research questions, such as the count of applicants by gender, state-wise approved applicants, and applicants by age type. In existing systems, Aadhaar data analysis is done either manually or on primitive SQL platforms, which may take days to complete. This paper focuses on Aadhaar data analysis using different distributed computing frameworks, namely MapReduce, Hive, and Apache Spark on top of Hadoop, which could support better decision-making by government bodies, and we conclude that the Apache Spark framework is efficient in terms of performance.
- Copyright
- © 2021, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY  - CONF
AU  - R Roopa
AU  - Varsha Ryali
AU  - Tejasvi Shrivastava
AU  - Syed Mahmood Nabeel Anwar
PY  - 2021
DA  - 2021/09/13
TI  - Aadhaar Data Analysis Comparison in MapReduce, Hive and Spark
BT  - Proceedings of the 3rd International Conference on Integrated Intelligent Computing Communication & Security (ICIIC 2021)
PB  - Atlantis Press
SP  - 286
EP  - 295
SN  - 2589-4900
UR  - https://doi.org/10.2991/ahis.k.210913.036
DO  - 10.2991/ahis.k.210913.036
ID  - Roopa2021
ER  -