Speakers Identification Using Diarization Techniques
- DOI
- 10.2991/978-94-6463-136-4_80
- Keywords
- Speaker Diarization; End-to-End Neural Diarization (EEND); Mel Frequency Cepstrum Coefficients (MFCC); Generative Adversarial Networks (GANs); Hidden Markov Model (HMM)
- Abstract
This research work analyses development methodologies for speaker voice identification and voice separation and presents an overview of the findings. Several speech recognition methods, such as Mel Frequency Cepstrum Coefficients (MFCC), Vector Quantization (VQ), Hidden Markov Model (HMM), Long Short-Term Memory (LSTM), End-to-End Neural Diarization (EEND), Generative Adversarial Networks (GANs), Convolutional Neural Networks, and audio embeddings, can be used for adaptive processing and for identifying multiple speakers in audio data. Additionally, we address the uses of speaker diarization, the potential for future development, and the databases used to evaluate diarization systems.
The speaker diarization method consists of seven steps: input, front-end processing, speech activity detection, segmentation, speaker embedding, clustering post-processing, and output.
Speaker identification, a kind of speech recognition, recognizes the speakers in an audio conversation. Speaker diarization is the procedure of identifying who talks when in a multi-speaker audio recording. Such recordings include conferences, broadcast news, and any other public gathering with many speakers.
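The seven steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the input is a synthetic signal in which two pure tones stand in for two speakers, band log-energies stand in for a real speaker embedding such as MFCCs, and a plain k-means with two clusters stands in for the clustering stage. All function names, the 8 kHz sample rate, the frame size, and the energy threshold are assumptions chosen for illustration.

```python
import numpy as np

SR = 8000  # sample rate in Hz (assumed for this toy example)

def make_toy_audio():
    """Input: two synthetic 'speakers' (pure tones) separated by silence."""
    t = np.arange(SR) / SR  # one second of samples
    spk_a = 0.5 * np.sin(2 * np.pi * 200 * t)
    spk_b = 0.5 * np.sin(2 * np.pi * 900 * t)
    silence = np.zeros(SR // 2)
    return np.concatenate([spk_a, silence, spk_b, silence, spk_a])

def front_end(x):
    """Front-end processing: remove the DC offset and peak-normalize."""
    x = x - x.mean()
    return x / (np.abs(x).max() + 1e-12)

def frame(x, win=400, hop=400):
    """Segmentation: cut the signal into fixed-length windows."""
    n = (len(x) - win) // hop + 1
    return np.stack([x[i * hop:i * hop + win] for i in range(n)])

def speech_activity(frames, threshold=0.01):
    """Speech activity detection: keep frames whose mean energy is high."""
    return (frames ** 2).mean(axis=1) > threshold

def embed(frames, n_bands=8):
    """Embedding stand-in: log energy in n_bands linear spectral bands."""
    window = np.hanning(frames.shape[1])
    spec = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    bands = np.array_split(spec, n_bands, axis=1)
    return np.log(np.stack([b.sum(axis=1) for b in bands], axis=1) + 1e-10)

def kmeans(X, k=2, iters=20):
    """Clustering: k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def diarize(x, win=400, hop=400):
    """Run all stages and return (start_s, end_s, speaker) turns."""
    frames = frame(front_end(x), win, hop)
    active = speech_activity(frames)
    labels = np.full(len(frames), -1)  # -1 marks non-speech frames
    labels[active] = kmeans(embed(frames[active]))
    # Post-processing: merge consecutive frames with the same label.
    turns, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] >= 0:
                turns.append((start * hop / SR, i * hop / SR, int(labels[start])))
            start = i
    return turns

if __name__ == "__main__":
    # Output: one line per speaker turn, answering "who talks when".
    for start_s, end_s, speaker in diarize(make_toy_audio()):
        print(f"{start_s:.1f}s to {end_s:.1f}s: speaker {speaker}")
```

On this toy input the sketch reports three turns: speaker 0 from 0.0 s to 1.0 s, speaker 1 from 1.5 s to 2.5 s, and speaker 0 again from 3.0 s to 4.0 s. Real recordings would need a learned embedding, overlapping windows, and a way to estimate the number of speakers, none of which this example attempts.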
- Copyright
- © 2023 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Vinod K. Pande
AU - Vijay K. Kale
PY - 2023
DA - 2023/05/01
TI - Speakers Identification Using Diarization Techniques
BT - Proceedings of the International Conference on Applications of Machine Intelligence and Data Analytics (ICAMIDA 2022)
PB - Atlantis Press
SP - 905
EP - 915
SN - 2352-538X
UR - https://doi.org/10.2991/978-94-6463-136-4_80
DO - 10.2991/978-94-6463-136-4_80
ID - Pande2023
ER -