Multi-channel Source Separation by Factorial HMMs

Reyes-Gomez, Manuel; Raj, Bhiksha; Ellis, Daniel P. W.

We present a new speaker-separation algorithm for separating signals with known statistical characteristics from mixed multi-channel recordings. Speaker separation has conventionally been treated as a problem of blind source separation (BSS). This approach does not utilize any knowledge of the statistical characteristics of the signals to be separated, relying mainly on the independence between the various signals to separate them. We present an algorithm that utilizes detailed statistical information about the signals to be separated, represented in the form of hidden Markov models (HMM). We treat the signal separation problem as one of beamforming, where each signal is extracted using a filter-and-sum array. The filters are estimated to maximize the likelihood of the summed output, measured on the HMM for the desired signal. This is done by iteratively estimating the best state sequence through the HMM from a factorial HMM (FHMM) that is the cross-product of the HMMs for the multiple signals, using the current output of the array, and estimating the filters to maximize the likelihood of that state sequence. Experiments show that the proposed method can cleanly extract a background speaker who is 20 dB below the foreground speaker in a two-speaker mixture, when the HMMs for the signals are constructed from knowledge of the utterance transcriptions.


Also Published In

2003 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings: April 6-10, 2003, Hong Kong Exhibition and Convention Centre, Hong Kong

More About This Work

Academic Units
Electrical Engineering
Published Here
June 29, 2012