Proceedings of the International Conference on Computational Innovations and Emerging Trends (ICCIET- 2024)

Intelligent Bot for Excel Data Deduplication: A Cutting-Edge Approach to Eliminate Duplicate Entries

Authors
A. V. Sriharsha¹,*, Shaik Naziya Fathima¹, M. Nikhitha¹, B. Tarun Kumar¹, P. Penchal Mohan Pranay¹
¹Mohan Babu University, Tirupathi, India
*Corresponding author. Email: avsreeharsha@gmail.com
Available Online 30 July 2024.
DOI
10.2991/978-94-6463-471-6_31
Keywords
Duplicate Entries; Data Quality; AI-bot; Machine Learning
Abstract

Matching is central to data deduplication, but inconsistent and incomplete data make it difficult. Unstructured data calls for intelligent NLP algorithms, and large datasets require robust technology. Dependable deduplication solutions therefore rely on machine learning approaches, sophisticated algorithms, and predictive analytics [2]. An AI-driven bot can automatically identify and merge duplicate entries in Excel spreadsheets, producing clean, correct data and saving time. The AI-bot analyzes data types, applies matching logic, and lets users define flexibly what counts as a duplicate. It uses fuzzy matching algorithms to identify similar text entries, adjusts matching thresholds, merges duplicate rows intelligently, and presents the merged rows for user confirmation.
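
The fuzzy matching, threshold, and row-merging steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 0.85 threshold, the "first non-empty value wins" merge rule, and the sample rows are all assumptions chosen for illustration, and the standard-library `difflib.SequenceMatcher` stands in for whatever fuzzy algorithm the paper uses.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio after trimming and lowercasing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def find_duplicate_groups(rows, key, threshold=0.85):
    """Group row indices whose `key` value is at least `threshold`
    similar to the first row of an existing group (hypothetical rule)."""
    groups = []
    for i, row in enumerate(rows):
        for group in groups:
            if similarity(rows[group[0]][key], row[key]) >= threshold:
                group.append(i)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([i])
    return groups


def merge_group(rows, group):
    """Merge a group into one row, keeping the first non-empty value
    seen for each column (an assumed merge policy)."""
    merged = dict(rows[group[0]])
    for i in group[1:]:
        for col, val in rows[i].items():
            if merged.get(col) in (None, "") and val not in (None, ""):
                merged[col] = val
    return merged


if __name__ == "__main__":
    # Sample rows are invented; in practice they would come from a spreadsheet.
    rows = [
        {"name": "Jon Smith", "email": "jon@example.com"},
        {"name": "John Smith", "email": ""},
        {"name": "Alice Rao", "email": "alice@example.com"},
    ]
    for group in find_duplicate_groups(rows, key="name"):
        if len(group) > 1:
            # A real tool would show this row to the user for confirmation.
            print(merge_group(rows, group))
```

Raising the threshold makes matching stricter and lowering it catches more near-duplicates at the cost of false merges, which is why the merged rows are shown to the user before being accepted.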

Copyright
© 2024 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


Volume Title
Proceedings of the International Conference on Computational Innovations and Emerging Trends (ICCIET- 2024)
Series
Advances in Computer Science Research
Publication Date
30 July 2024
ISBN
978-94-6463-471-6
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - A. V. Sriharsha
AU  - Shaik Naziya Fathima
AU  - M. Nikhitha
AU  - B. Tarun Kumar
AU  - P. Penchal Mohan Pranay
PY  - 2024
DA  - 2024/07/30
TI  - Intelligent Bot for Excel Data Deduplication: A Cutting-Edge Approach to Eliminate Duplicate Entries
BT  - Proceedings of the International Conference on Computational Innovations and Emerging Trends (ICCIET- 2024)
PB  - Atlantis Press
SP  - 314
EP  - 324
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-471-6_31
DO  - 10.2991/978-94-6463-471-6_31
ID  - Sriharsha2024
ER  -