Model-Based Monaural Source Separation Using a Vector-Quantized Phase-Vocoder Representation

Ellis, Daniel P. W.; Weiss, Ron J.

A vector quantizer (VQ) trained on short-time frames of a particular source can form an accurate non-parametric model of that source. This principle has been used in several previous source separation and enhancement schemes as a basis for filtering the original mixture. In this paper, we propose the "projection" of a corrupted target signal onto the constrained space represented by the model as a viable approach to source separation. We investigate some parameters of VQ encoding, including a more perceptually motivated distance measure and an encoding of phase derivatives that supports reconstruction directly from quantizer output alone. For the problem of separating speech from noise, we highlight some problems with this approach, including the need for sequential constraints (which we introduce with a simple hidden Markov model) and the choice of the best quantization for overlapping sources.
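
The "projection" idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes generic feature frames stored as rows of a NumPy array, a codebook trained with plain k-means, and a squared Euclidean distance in place of the perceptually motivated measure the paper investigates. The phase-derivative encoding and the hidden Markov model constraints are omitted. The function names (train_codebook, quantize, project) and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def quantize(frames, codebook):
    """Return the index of the nearest codeword for each frame."""
    # Squared Euclidean distance between every frame and every codeword
    # (an assumption; the paper studies a perceptually motivated measure).
    diff = frames[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)


def train_codebook(frames, k, iters=20, seed=0):
    """Fit a k-entry codebook to the training frames with plain k-means."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(frames), size=k, replace=False)
    codebook = frames[start].astype(float)
    for _ in range(iters):
        idx = quantize(frames, codebook)
        for j in range(k):
            members = frames[idx == j]
            if len(members):  # leave empty cells unchanged
                codebook[j] = members.mean(axis=0)
    return codebook


def project(frames, codebook):
    """Replace each frame by its nearest codeword from the source model."""
    return codebook[quantize(frames, codebook)]


# Hypothetical data: "clean" training frames of the target source, and a
# corrupted observation formed by adding noise to those same frames.
rng = np.random.default_rng(1)
clean = rng.normal(size=(200, 8))
noisy = clean + 0.3 * rng.normal(size=clean.shape)

codebook = train_codebook(clean, k=16)
estimate = project(noisy, codebook)  # every row is now a codebook entry
```

Because each frame is quantized independently here, nothing prevents implausible jumps between consecutive codewords. That is the gap the sequential constraints mentioned in the abstract are meant to address.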


Also Published In

2006 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings: May 14-19, 2006, Centre de Congrès Pierre Baudis, Toulouse, France

More About This Work

Academic Units
Electrical Engineering
Published Here
June 28, 2012