A Novel Unit Selection and Unit Smoothing Method for Chinese Concatenation Speech
- DOI
- 10.2991/masta-19.2019.47
- Keywords
- Statistical parametric speech synthesis, LSTM, DTW, Concatenation smoothing
- Abstract
This paper introduces a new approach to unit selection and unit concatenation in which the Chinese character is the smallest unit in the speech corpus and, at the concatenation stage, speech segments are concatenated with respect not only to phase but also to amplitude. A conventional hybrid system is used. First, Long Short-Term Memory (LSTM) networks are adopted for the acoustic model and the duration model, and prosody is predicted by Conditional Random Fields (CRFs). Second, instead of computing a continuously-valued cost, Dynamic Time Warping (DTW) is applied directly to select units using acoustic features such as the mel-cepstrum and fundamental frequency (F0). Finally, an improved cross-fade method that takes amplitude into account is adopted in waveform concatenation to improve smoothing, and very natural speech is synthesized.
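The DTW-based selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the local distance, the feature layout, or any path constraints, so this sketch assumes a Euclidean frame distance, feature matrices of shape (frames, dimensions), and the hypothetical function names `dtw_distance` and `select_unit`.

```python
import numpy as np


def dtw_distance(a, b):
    """Accumulated DTW cost between two feature sequences.

    a, b: arrays of shape (frames, dims), e.g. mel-cepstrum plus F0 per frame.
    Assumption: Euclidean distance between frames, unconstrained warping path.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # Pairwise frame-to-frame distances, shape (n, m).
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # acc[i, j] = cheapest cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_prev
    return acc[n, m]


def select_unit(target, candidates):
    """Return the index of the candidate unit with the lowest DTW cost
    against the predicted (target) feature sequence."""
    costs = [dtw_distance(target, c) for c in candidates]
    return int(np.argmin(costs))
```

In such a setup, `target` would hold the acoustic features predicted by the LSTM models for one character, and `candidates` would hold the features of every corpus recording of that character. Because DTW tolerates differences in duration, candidates of different lengths can be compared without resampling.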
- Copyright
- © 2019, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Xiao-kang Yang
AU - Zhi-cheng Liu
AU - Qi-long Sun
AU - Hao-yuan Wang
PY - 2019/07
DA - 2019/07
TI - A Novel Unit Selection and Unit Smoothing Method for Chinese Concatenation Speech
BT - Proceedings of the 2019 International Conference on Modeling, Analysis, Simulation Technologies and Applications (MASTA 2019)
PB - Atlantis Press
SP - 280
EP - 285
SN - 1951-6851
UR - https://doi.org/10.2991/masta-19.2019.47
DO - 10.2991/masta-19.2019.47
ID - Yang2019/07
ER -