2003 Articles
Multi-channel Source Separation by Factorial HMMs
We present a new speaker-separation algorithm for separating signals with known statistical characteristics from mixed multi-channel recordings. Speaker separation has conventionally been treated as a problem of blind source separation (BSS). That approach uses no knowledge of the statistical characteristics of the signals to be separated and relies mainly on the independence between the signals to separate them. Our algorithm instead uses detailed statistical information about the signals to be separated, represented in the form of hidden Markov models (HMMs). We treat signal separation as a beamforming problem, in which each signal is extracted using a filter-and-sum array. The filters are estimated to maximize the likelihood of the summed output, measured on the HMM for the desired signal. This is done iteratively: the best state sequence through the desired signal's HMM is estimated, using the current output of the array, from a factorial HMM (FHMM) that is the cross-product of the HMMs for the multiple signals; the filters are then re-estimated to maximize the likelihood of that state sequence. Experiments show that the proposed method can cleanly extract a background speaker who is 20 dB below the foreground speaker in a two-speaker mixture, when the HMMs for the signals are constructed from knowledge of the utterance transcriptions.
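The two building blocks named in the abstract, a filter-and-sum array and a cross-product (factorial) state space, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `filter_and_sum` and `factorial_transitions`, the use of FIR filters, and the assumption that the two speakers' Markov chains evolve independently (so the joint transition matrix is a Kronecker product) are illustrative choices made here. The likelihood-based filter estimation described in the abstract is not shown.

```python
def filter_and_sum(channels, filters):
    """Filter each microphone channel with its own FIR filter, then sum.

    channels: list of equal-length sample lists, one per microphone.
    filters:  list of FIR tap lists, one per microphone (assumed FIR here).
    Samples before the start of the recording are treated as zero.
    """
    n = len(channels[0])
    out = [0.0] * n
    for x, h in zip(channels, filters):
        for t in range(n):
            acc = 0.0
            for k, tap in enumerate(h):
                if t - k >= 0:
                    acc += tap * x[t - k]
            out[t] += acc
    return out


def factorial_transitions(a, b):
    """Joint transition matrix over the cross-product of two state spaces.

    a, b: row-stochastic transition matrices (lists of lists).
    Assumes the two chains evolve independently, so the joint matrix is
    the Kronecker product of a and b. Joint state (i, j) is stored at
    index i * len(b) + j.
    """
    na, nb = len(a), len(b)
    joint = [[0.0] * (na * nb) for _ in range(na * nb)]
    for i in range(na):
        for j in range(nb):
            for k in range(na):
                for m in range(nb):
                    joint[i * nb + j][k * nb + m] = a[i][k] * b[j][m]
    return joint
```

For example, `filter_and_sum([[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]], [[1.0, -1.0], [2.0]])` applies a first-difference filter to the first channel and a gain of 2 to the second before summing. Two 2-state chains passed to `factorial_transitions` give a 4-state joint chain, which shows why the factorial state space grows multiplicatively with the number of sources.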
Files
- icassp03-fhmm.pdf (application/pdf, 289 KB)
Also Published In
- Title: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings: April 6-10, 2003, Hong Kong Exhibition and Convention Centre, Hong Kong
- Publisher: IEEE
- DOI: https://doi.org/10.1109/ICASSP.2003.1198868
More About This Work
- Academic Units: Electrical Engineering
- Published Here: June 29, 2012