Research On Prosody Conversion of Affective Speech Based on LIBSVM and PAD Three Dimensional Emotion Model
- DOI
- 10.2991/ssehr-16.2016.181
- Keywords
- PAD emotion model, five-scale tone model, Library for Support Vector Machines (LIBSVM) support vector regression, generalized regression neural network, prosody conversion.
- Abstract
This paper proposes a framework for prosody conversion of emotional speech based on a LIBSVM support vector regression (SVR) model and the PAD three-dimensional emotion model. We design an emotional speech corpus covering 11 kinds of emotional utterances, and each utterance is labeled with its emotional information as PAD values. A five-scale tone model is employed to model the pitch contour of emotional speech at the syllable level. A LIBSVM SVR-based prosody conversion model is proposed to transform the pitch contour, duration, and pause duration of emotional speech according to the PAD values of the emotion and the context information of the text. Speech is then re-synthesized with the STRAIGHT algorithm by modifying the pitch contour, duration, and pause duration, and the results are compared with those obtained by a generalized regression neural network. Experimental results show that the modified speech achieves an average Emotional Mean Opinion Score (EMOS) of 3.8.
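As a rough illustration of the inputs such a conversion model might work with, the sketch below maps pitch values onto a five-point tone scale and assembles a regression input from PAD values and context features. The abstract does not give the formulas, so the log-frequency normalization used here is an assumption (it is one commonly used five-degree normalization), and all function names, the pitch range, and the feature layout are hypothetical.

```python
import math


def five_scale_tone(f0_hz, f0_min, f0_max):
    """Map a pitch value in Hz onto a 0-5 tone scale.

    Assumption: log-frequency normalization over the speaker's pitch
    range [f0_min, f0_max]; the paper's exact formula is not stated.
    """
    if not (0 < f0_min < f0_max):
        raise ValueError("need 0 < f0_min < f0_max")
    # Clamp so that values outside the measured range stay on the scale.
    f0 = min(max(f0_hz, f0_min), f0_max)
    span = math.log(f0_max) - math.log(f0_min)
    return 5.0 * (math.log(f0) - math.log(f0_min)) / span


def syllable_tone_contour(f0_points, f0_min, f0_max):
    """Convert the pitch samples of one syllable to tone-scale values."""
    return [five_scale_tone(f, f0_min, f0_max) for f in f0_points]


def build_feature_vector(pad, context):
    """Concatenate (P, A, D) with numeric context features.

    Hypothetical layout: a regressor such as an SVR would take this
    vector as input and predict prosody parameters for the syllable.
    """
    p, a, d = pad
    return [p, a, d] + list(context)
```

For example, `syllable_tone_contour([120.0, 180.0, 240.0], 100.0, 400.0)` yields three tone values between 0 and 5 that could serve as regression targets for one syllable, while `build_feature_vector` produces the corresponding input.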
- Copyright
- © 2016, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY  - CONF
AU  - Xiaoyong Lu
AU  - Tao Pan
PY  - 2016/07
DA  - 2016/07
TI  - Research On Prosody Conversion of Affective Speech Based on LIBSVM and PAD Three Dimensional Emotion Model
BT  - Proceedings of 2016 5th International Conference on Social Science, Education and Humanities Research
PB  - Atlantis Press
SP  - 851
EP  - 858
SN  - 2352-5398
UR  - https://doi.org/10.2991/ssehr-16.2016.181
DO  - 10.2991/ssehr-16.2016.181
ID  - Lu2016/07
ER  -