Speaker turn segmentation based on between-channel differences
- Ellis, Daniel P. W.; Liu, Jerry C.
- Electrical Engineering
- Persistent URL: https://doi.org/10.7916/D8S75RPJ
- Book/Journal Title: NIST ICASSP 2004 Meeting Recognition Workshop, Montreal
- National Institute of Standards and Technology
- Multichannel recordings of meetings provide information on speaker locations in the timing and level differences between microphones. We have been experimenting with cross-correlation and energy differences as features to identify and segment speaker turns. In particular, we have used LPC whitening, spectral-domain cross-correlation, and dynamic programming to sharpen and disambiguate timing differences between mic channels that may be dominated by noise and reverberation. These cues are classified into individual speakers using spectral clustering (i.e., clustering defined by the top eigenvectors of a similarity matrix). We show that this technique is largely robust to precise details such as mic positioning, and can be used with some success on data collected from a number of different setups, as provided by the NIST 2004 Meetings evaluation.
- Electrical engineering
- Suggested Citation:
- Daniel P. W. Ellis, Jerry C. Liu, 2004, Speaker turn segmentation based on between-channel differences, Columbia University Academic Commons, https://doi.org/10.7916/D8S75RPJ.
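
The abstract names two between-channel cues: timing differences found by cross-correlation, and level (energy) differences. As a rough illustration only, the sketch below computes both for a toy two-microphone signal. It is not the authors' implementation: it uses plain time-domain cross-correlation on synthetic white noise, so the LPC whitening, spectral-domain correlation, dynamic programming, and spectral clustering steps described in the abstract are not shown. The function names `best_lag` and `level_difference_db`, the delay of 5 samples, and the gain of 0.5 are all assumptions made for this example.

```python
import math
import random


def best_lag(ref, other, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means `other` looks like a delayed copy of `ref`.
    Hypothetical helper: plain time-domain correlation, no whitening.
    """
    n = len(ref)
    best, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += ref[i] * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def level_difference_db(ref, other):
    """Energy of `other` relative to `ref`, in decibels."""
    e_ref = sum(x * x for x in ref)
    e_other = sum(x * x for x in other)
    return 10.0 * math.log10(e_other / e_ref)


# Toy data (assumed): one white-noise source heard by two microphones.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(400)]

delay, gain = 5, 0.5  # mic B hears the source later and quieter
mic_a = source
mic_b = [0.0] * delay + [gain * x for x in source[:-delay]]

lag = best_lag(mic_a, mic_b, max_lag=20)
diff_db = level_difference_db(mic_a, mic_b)
print(f"estimated lag: {lag} samples, level difference: {diff_db:.1f} dB")
```

White noise is used here because its cross-correlation has a single sharp peak. Real speech in a reverberant room gives a much broader and more ambiguous peak, which is the problem the whitening and dynamic-programming steps in the abstract are meant to address.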