Selection of Optimal Solution for Example and Model of Retrieval Based Voice Conversion
- DOI
- 10.2991/978-94-6463-370-2_48
- Keywords
- Timbre conversion; Mel cepstral distortion; Model training; Objective evaluation; Subjective evaluation
- Abstract
Since 2010, computing has advanced steadily in the field of speech processing. Speech-to-text technology is now mature, but timbre conversion and imitation remain imperfect. A new timbre imitation program has recently attracted attention, yet guidance on its model training options is still lacking. This paper trains models through hands-on use of the program, setting custom values in its model training step. The Retrieval Based Voice Conversion (RVC) model is trained multiple times, and the timbre produced by models trained for different numbers of rounds is compared with the sound source. After training, two evaluation methods are used to check each model's similarity to the source. One is an objective method based on Mel cepstral distortion, implemented in software. The other is a subjective method based on directly collecting listeners' perceptual judgments. Similarity statistics are obtained from each method, selection criteria for a generally optimal model are derived from them, and relative standard training reference values are provided for users.
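The objective measure named in the abstract can be illustrated with a minimal sketch. The abstract does not say which software or which variant of Mel cepstral distortion the paper uses, so the code below assumes the commonly used definition: per-frame distortion in dB, with the 0th (energy) coefficient excluded, averaged over frames. It also assumes the two mel-cepstral sequences have already been extracted and time-aligned. The function name `mel_cepstral_distortion` and its inputs are hypothetical and chosen only for illustration.

```python
import math


def mel_cepstral_distortion(ref_frames, conv_frames):
    """Return the mean Mel cepstral distortion (dB) between two sequences.

    Both arguments are lists of frames; each frame is a list of mel-cepstral
    coefficients. The sequences are assumed to be time-aligned already and
    to have the same number of frames and coefficients (an assumption of
    this sketch, not something stated in the abstract).
    """
    if not ref_frames or len(ref_frames) != len(conv_frames):
        raise ValueError("sequences must be non-empty and equally long")

    scale = 10.0 / math.log(10.0)  # converts the distance to decibels
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        if len(ref) != len(conv):
            raise ValueError("frames must have the same number of coefficients")
        # Skip coefficient 0, which mostly carries frame energy.
        squared = sum((r - c) ** 2 for r, c in zip(ref[1:], conv[1:]))
        total += scale * math.sqrt(2.0 * squared)
    return total / len(ref_frames)


# Toy input: two frames of three coefficients each (values are made up).
source = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
converted = [[5.0, 1.0, 0.0], [5.0, 1.0, 0.0]]
mcd_value = mel_cepstral_distortion(source, converted)
```

Under this definition a lower value means the converted timbre is closer to the source, so models trained for different numbers of rounds could be ranked by their mean distortion against the same reference recording.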
- Copyright
- © 2024 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Zhongxi Ren
PY  - 2024
DA  - 2024/02/14
TI  - Selection of Optimal Solution for Example and Model of Retrieval Based Voice Conversion
BT  - Proceedings of the 2023 International Conference on Data Science, Advanced Algorithm and Intelligent Computing (DAI 2023)
PB  - Atlantis Press
SP  - 468
EP  - 475
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-370-2_48
DO  - 10.2991/978-94-6463-370-2_48
ID  - Ren2024
ER  -