Academic Commons

A Simple Correlation-Based Model of Intelligibility for Nonlinear Speech Enhancement and Separation

Boldt, Jesper B.; Ellis, Daniel P. W.

Applying a binary mask to a pure noise signal can result in speech that is highly intelligible, even though none of the target speech signal is present. Therefore, to estimate the intelligibility benefit of highly nonlinear speech enhancement techniques, we contend that SNR is not useful; instead we propose a measure based on the similarity between the time-varying spectral envelopes of the target speech and the system output, as measured by correlation. As with previous correlation-based intelligibility measures, our system can broadly match subjective intelligibility for a range of enhanced signals. Our system, however, is notably simpler, and we explain the practical motivation behind each stage. This measure, freely available as a small Matlab implementation, can serve as a more meaningful evaluation measure for nonlinear speech enhancement systems and as a transparent objective function for the optimization of such systems.
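
The core idea in the abstract, correlating the time-varying spectral envelopes of the target speech and the system output, can be illustrated with a minimal sketch. This is not the authors' Matlab implementation and does not reproduce the stages of their model. The function names (band_envelopes, envelope_correlation), the frame length, hop size, number of bands, and the equal-width pooling of FFT bins are all assumptions chosen only for illustration.

```python
import numpy as np


def band_envelopes(x, frame_len=512, hop=256, n_bands=16):
    """Return coarse spectral envelopes with shape (n_frames, n_bands).

    All parameters are illustrative assumptions, not values from the paper.
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Pool FFT bins into equal-width bands (assumption: a real model
    # would more plausibly use an auditory-style filterbank).
    edges = np.linspace(0, mag.shape[1], n_bands + 1).astype(int)
    bands = [mag[:, lo:hi].mean(axis=1) for lo, hi in zip(edges[:-1], edges[1:])]
    return np.stack(bands, axis=1)


def envelope_correlation(target, output, **kwargs):
    """Pearson correlation between the band envelopes of two signals."""
    if len(target) != len(output):
        raise ValueError("target and output must have the same length")
    a = band_envelopes(target, **kwargs).ravel()
    b = band_envelopes(output, **kwargs).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # Guard against constant envelopes, where correlation is undefined.
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```

Calling envelope_correlation(target, output) on two equal-length sample arrays returns a value between -1 and 1. A signal compared with itself scores 1 up to rounding error, and an output whose envelopes are unrelated to the target's should score near 0. Unlike SNR, the score depends only on how the envelopes co-vary, not on how much target energy survives in the output.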

Also Published In

Title: EUSIPCO 2009: 17th European Signal Processing Conference, August 24-28, 2009, Glasgow, Scotland
Publisher: European Association for Signal, Speech, and Image Processing

More About This Work

Academic Units: Electrical Engineering
Published Here: June 26, 2012