NMF based speech and music separation in monaural speech recordings with sparseness and temporal continuity constraints
- DOI
- 10.2991/icmt-13.2013.67
- Keywords
- non-negative matrix factorization·speech and music separation·sparse coding·temporal continuity·semi-supervised learning
- Abstract
This paper proposes a semi-supervised approach to speech and music separation in monaural speech recordings based on non-negative matrix factorization (NMF). Assuming that the genre of the background music is known, music basis vectors are randomly picked from the magnitude of the short-time Fourier transform (STFT) of training music, while speech basis vectors are estimated by running NMF on the STFT magnitude of the polluted speech signal. Moreover, we apply sparseness and temporal continuity constraints to speech and music respectively and evaluate how the different constraints influence separation performance. The test set contains 10 Mandarin speech utterances from 10 speakers mixed with music at different speech-music ratios (SMR). The baseline is a semi-supervised separation system with no constraints. The results reveal that adding a temporal continuity constraint can improve separation performance compared with both the baseline and a separation system with only a sparseness constraint.
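The unconstrained baseline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the NMF cost function, so the KL-divergence multiplicative updates, the column normalization of the music bases, the soft-mask reconstruction, and all function and parameter names (`pick_music_bases`, `semi_supervised_nmf`, `n_speech`) are assumptions made for illustration. Random matrices stand in for real STFT magnitudes, and the sparseness and temporal continuity terms are omitted.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def pick_music_bases(V_train, n_bases, rng):
    """Randomly pick columns (frames) of the training-music magnitude
    spectrogram as fixed music basis vectors.
    Normalizing each column to unit sum is an assumption of this sketch."""
    idx = rng.choice(V_train.shape[1], size=n_bases, replace=False)
    W = V_train[:, idx]
    return W / (W.sum(axis=0, keepdims=True) + EPS)


def semi_supervised_nmf(V, W_music, n_speech, n_iter=200, seed=0):
    """Factorize V ~ [W_music, W_speech] @ H with W_music held fixed.
    Uses KL-divergence multiplicative updates (assumed cost function)."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_music = W_music.shape[1]
    W_speech = rng.random((n_freq, n_speech)) + EPS
    H = rng.random((k_music + n_speech, n_frames)) + EPS

    for _ in range(n_iter):
        W = np.hstack([W_music, W_speech])
        # Update all activations (music and speech rows of H).
        ratio = V / (W @ H + EPS)
        H *= (W.T @ ratio) / (W.T.sum(axis=1, keepdims=True) + EPS)
        # Update the speech bases only; the music bases stay fixed.
        ratio = V / (W @ H + EPS)
        H_speech = H[k_music:]
        W_speech *= (ratio @ H_speech.T) / (H_speech.sum(axis=1) + EPS)

    # Soft-mask reconstruction of each source from the mixture magnitude.
    music_part = W_music @ H[:k_music]
    speech_part = W_speech @ H[k_music:]
    total = music_part + speech_part + EPS
    return speech_part / total * V, music_part / total * V


# Synthetic stand-ins for |STFT| of training music and of the mixture.
rng = np.random.default_rng(0)
V_train = rng.random((64, 100))
V_mix = rng.random((64, 50))

W_music = pick_music_bases(V_train, n_bases=20, rng=rng)
speech_est, music_est = semi_supervised_nmf(V_mix, W_music, n_speech=20)
```

In a real pipeline the two estimated magnitudes would be combined with the mixture phase and inverted with an inverse STFT; that step is left out here because the abstract does not describe it.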
- Copyright
- © 2013, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Tu Ming
AU - Xie Xiang
AU - Jiao Yishan
PY - 2013/11
DA - 2013/11
TI - NMF based speech and music separation in monaural speech recordings with sparseness and temporal continuity constraints
BT - Proceedings of 3rd International Conference on Multimedia Technology (ICMT-13)
PB - Atlantis Press
SP - 541
EP - 548
SN - 1951-6851
UR - https://doi.org/10.2991/icmt-13.2013.67
DO - 10.2991/icmt-13.2013.67
ID - Ming2013/11
ER -