Real-time Object Detection and Voice Labeling for Enhanced Accessibility and Visual Interaction
- DOI
- 10.2991/978-94-6463-471-6_70
- Keywords
- YOLO Version 7; Voice Labeling; Neural Network; Visual Interaction; Small Object Handling; Pre-trained Model; Versatility; Fast Inference
- Abstract
This work introduces an approach to real-time object recognition built on YOLO Version 7, a detector that operates in real time on images, videos, and live webcam feeds. Unlike traditional methods, the system announces each detection aloud, stating the object’s name together with the accuracy and confidence levels reported by the algorithm. Beyond enhancing accessibility, the system can also be used to develop educational and engaging resources. Using a model pre-trained on the MS COCO dataset, YOLO Version 7 provides accurate and fast object recognition, even for small objects. By exploiting the system’s speed and precision, the initiative aims to make information less intimidating and more engaging, particularly for individuals with visual impairments. The dataset supports comprehensive evaluation with 118,287 training images, 5,000 validation images, and 20,288 test images spanning 80 object classes. The advantages of the proposed method are speed, accuracy, richer visual interaction, fast inference, flexibility across situations, and improved handling of small objects. The solution takes input from several sources, including cameras and image or video files, and recognizes objects using the YOLO Version 7 algorithm.
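The voice-labeling step described in the abstract, where each detected object is spoken with its name and confidence, might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation: `Detection`, `describe`, `announce`, and the 0.5 confidence threshold are hypothetical, the detector that produces the detections is out of scope, and `speak` stands in for whatever text-to-speech backend is used.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Detection:
    """One detected object: a class label and a confidence score in [0, 1]."""
    label: str
    confidence: float


def describe(det: Detection) -> str:
    """Format one detection as a short phrase suitable for speech output."""
    return f"{det.label}, {round(det.confidence * 100)} percent confidence"


def announce(
    detections: Iterable[Detection],
    speak: Callable[[str], None] = print,
    min_confidence: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[str]:
    """Speak every detection at or above the threshold, most confident first.

    `speak` is any callable that accepts a string; it defaults to `print`
    so the sketch runs without a text-to-speech dependency.
    """
    kept = sorted(
        (d for d in detections if d.confidence >= min_confidence),
        key=lambda d: d.confidence,
        reverse=True,
    )
    phrases = [describe(d) for d in kept]
    for phrase in phrases:
        speak(phrase)
    return phrases
```

Passing a text-to-speech function as `speak` in place of `print` would turn the printed phrases into audio. Filtering low-confidence detections before speaking is a design choice made here to keep the audio stream short; the paper does not state whether the system does this.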
- Copyright
- © 2024 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Matta Swathi
AU  - Ramala Supraja
AU  - Malavathu Lakshmi Prasanna
AU  - Shaik Sameer
AU  - Guntaka Rama Krishna Reddy
PY  - 2024
DA  - 2024/07/30
TI  - Real-time Object Detection and Voice Labeling for Enhanced Accessibility and Visual Interaction
BT  - Proceedings of the International Conference on Computational Innovations and Emerging Trends (ICCIET- 2024)
PB  - Atlantis Press
SP  - 721
EP  - 733
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-471-6_70
DO  - 10.2991/978-94-6463-471-6_70
ID  - Swathi2024
ER  -