Intelligent Bot for Excel Data Deduplication: A Cutting-Edge Approach to Eliminate Duplicate Entries
- DOI
- 10.2991/978-94-6463-471-6_31
- Keywords
- Duplicate Entries; Data Quality; AI-bot; Machine Learning
- Abstract
Record matching is crucial for data deduplication, but inconsistent and incomplete data make it challenging. Unstructured data calls for intelligent NLP algorithms, and large datasets require robust technology. Robust deduplication solutions therefore depend on machine learning approaches, sophisticated algorithms, and predictive analytics [2]. An AI-driven bot can automatically identify and merge duplicate entries in Excel spreadsheets, ensuring clean, correct data and saving time. This AI-bot analyzes data types, applies matching logic, and lets the user define flexibly what counts as a duplicate. It uses fuzzy matching algorithms to identify similar text entries, adjusts matching thresholds, merges duplicate rows intelligently, and presents the merged rows for user confirmation.
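The fuzzy matching, threshold, and merge steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes rows are plain dictionaries standing in for spreadsheet rows, uses the standard library's `difflib.SequenceMatcher` as the similarity measure, and the helper names (`similarity`, `find_duplicate_groups`, `merge_group`), the `0.85` threshold, and the sample data are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio after lowercasing and collapsing whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def find_duplicate_groups(rows, key, threshold=0.85):
    """Greedily group rows whose `key` value is similar to a group's first row.

    The threshold is an assumed, tunable value: raising it makes matching
    stricter, lowering it catches more (and noisier) near-duplicates.
    """
    groups = []
    for row in rows:
        for group in groups:
            if similarity(row[key], group[0][key]) >= threshold:
                group.append(row)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([row])
    return groups


def merge_group(group):
    """Merge a group into one row, keeping the first non-empty value per column."""
    merged = {}
    for row in group:
        for column, value in row.items():
            if not merged.get(column):
                merged[column] = value
    return merged


# Hypothetical sample data standing in for rows read from a spreadsheet.
rows = [
    {"name": "Jon Smith", "email": "jon@example.com", "phone": ""},
    {"name": "John Smith", "email": "", "phone": "555-0100"},
    {"name": "Alice Wong", "email": "alice@example.com", "phone": ""},
]

groups = find_duplicate_groups(rows, key="name", threshold=0.85)
# Proposed merged rows; a real tool would show these to the user for confirmation.
proposals = [merge_group(group) for group in groups]
```

In this sketch the two "Smith" rows fall into one group and are merged into a single proposed row that combines the email from the first with the phone number from the second. The merged rows are only proposals; the confirmation step described in the abstract is left to the caller.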
- Copyright
- © 2024 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - A. V. Sriharsha
AU - Shaik Naziya Fathima
AU - M. Nikhitha
AU - B. Tarun Kumar
AU - P. Penchal Mohan Pranay
PY - 2024
DA - 2024/07/30
TI - Intelligent Bot for Excel Data Deduplication: A Cutting-Edge Approach to Eliminate Duplicate Entries
BT - Proceedings of the International Conference on Computational Innovations and Emerging Trends (ICCIET- 2024)
PB - Atlantis Press
SP - 314
EP - 324
SN - 2352-538X
UR - https://doi.org/10.2991/978-94-6463-471-6_31
DO - 10.2991/978-94-6463-471-6_31
ID - Sriharsha2024
ER -