Real-time Object Detection and Voice Labeling for Enhanced Accessibility and Visual Interaction

Matta Swathi; Ramala Supraja; Malavathu Lakshmi Prasanna; Shaik Sameer; Guntaka Rama Krishna Reddy

doi:10.2991/978-94-6463-471-6_70

Authors

Matta Swathi¹, Ramala Supraja¹,*, Malavathu Lakshmi Prasanna¹, Shaik Sameer¹, Guntaka Rama Krishna Reddy¹

¹Lakireddy Bali Reddy College of Engineering, Mylavaram, India

*Corresponding author. Email: supraja123.ramala@gmail.com

Available Online 30 July 2024.

DOI: 10.2991/978-94-6463-471-6_70
Keywords: YOLO Version 7; Voice Labeling; Neural Network; Visual Interaction; Small Object Handling; Pre-trained Model; Versatility; Fast Inference
Abstract: This work introduces an approach to real-time object recognition built on YOLO Version 7, a detector that runs in real time on images, videos, and live webcam feeds. Unlike traditional methods, the system announces each detection aloud, stating the object's name together with the algorithm's accuracy and confidence levels. Beyond enhancing accessibility, the approach can also be used to develop educational and engaging resources. Using the MS COCO dataset and a pre-trained model, YOLO Version 7 provides accurate and fast object recognition, even for small objects. By drawing on the speed and precision of the system, the work aims to make information less intimidating and more engaging, particularly for individuals with visual impairments. The dataset supports comprehensive evaluation, with 118,287 training images, 5,000 validation images, and 20,288 test images spanning 80 object classes. The proposed method offers fast inference, accurate and quick object recognition, richer visual interaction, flexibility across situations, and improved handling of small objects. The solution takes input from several sources, including cameras and image or video files, and recognizes objects using the YOLO Version 7 algorithm.
Copyright: © 2024 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
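
The voice-labeling step described in the abstract (speaking each detected object's name and confidence) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Detection` record, the `build_announcement` helper, the 0.5 confidence threshold, and the sentence wording are all assumptions made for illustration, and the detector itself (YOLO Version 7 in the paper) is assumed to have already produced the label and confidence pairs.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Detection:
    """Hypothetical container for one detector output."""
    label: str         # class name, e.g. one of the 80 MS COCO classes
    confidence: float  # detector score in the range 0..1


def build_announcement(detections: Iterable[Detection],
                       min_confidence: float = 0.5) -> str:
    """Turn detections into one sentence suitable for text-to-speech.

    Detections below min_confidence (an assumed threshold) are dropped,
    and the rest are read out from most to least confident.
    """
    kept = sorted(
        (d for d in detections if d.confidence >= min_confidence),
        key=lambda d: d.confidence,
        reverse=True,
    )
    if not kept:
        return "No objects detected."
    parts = [
        f"{d.label}, {round(d.confidence * 100)} percent confidence"
        for d in kept
    ]
    return "Detected " + "; ".join(parts) + "."


if __name__ == "__main__":
    frame_detections = [
        Detection("person", 0.91),
        Detection("cup", 0.32),   # below the threshold, so not announced
        Detection("dog", 0.78),
    ]
    print(build_announcement(frame_detections))
```

The resulting string could then be handed to any text-to-speech engine; for example, with the pyttsx3 library one would call `engine = pyttsx3.init()`, `engine.say(text)`, and `engine.runAndWait()`. The source does not state which speech library the authors used, so that choice is also an assumption.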

Volume Title: Proceedings of the International Conference on Computational Innovations and Emerging Trends (ICCIET- 2024)
Series: Advances in Computer Science Research
Publication Date: 30 July 2024
ISBN: 978-94-6463-471-6
ISSN: 2352-538X

Cite this article

TY  - CONF
AU  - Matta Swathi
AU  - Ramala Supraja
AU  - Malavathu Lakshmi Prasanna
AU  - Shaik Sameer
AU  - Guntaka Rama Krishna Reddy
PY  - 2024
DA  - 2024/07/30
TI  - Real-time Object Detection and Voice Labeling for Enhanced Accessibility and Visual Interaction
BT  - Proceedings of the International Conference on Computational Innovations and Emerging Trends (ICCIET- 2024)
PB  - Atlantis Press
SP  - 721
EP  - 733
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-471-6_70
DO  - 10.2991/978-94-6463-471-6_70
ID  - Swathi2024
ER  -