Academic Commons

Articles

Learning Auditory Models of Machine Voices

Dobson, Kelly; Whitman, Brian; Ellis, Daniel P. W.

Vocal imitation is often found useful in machine therapy sessions, as it creates an empathic relational bridge between human and machine. Feedback from the machine that responds directly to the person's imitation can strengthen the trust of this connection. However, vocal imitations of machines often bear little resemblance to the target because of physiological limitations. In practice, we need a way to detect human vocalization of machine sounds that can generalize to new machines. In this study, we learn the relationship between vocal imitations of machine sounds and the target sounds to create a predictive model of vocalization of otherwise humanly impossible sounds. After training on a small set of machines and their imitations, we predict the correct target of a new set of imitations with high accuracy. The model outperforms distance metrics between human and machine sounds on the same task and takes into account auditory perception and constraints in vocal expression.
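The comparison the abstract describes, a learned imitation-to-target model against a plain distance metric, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the two-dimensional feature vectors, the machine names, and the "learned" model (a mean per-dimension offset from imitation features to machine features) are hypothetical and are not the model or data used in the paper.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_machine(query, machines):
    """Name of the machine whose features lie closest to the query."""
    return min(machines, key=lambda name: euclidean(query, machines[name]))

def fit_offset(imitations, machines):
    """Hypothetical stand-in for a learned model: the mean per-dimension
    difference (machine - imitation) over the training pairs."""
    names = list(imitations)
    dims = len(imitations[names[0]])
    return [
        sum(machines[n][d] - imitations[n][d] for n in names) / len(names)
        for d in range(dims)
    ]

def accuracy(imitations, machines, offset=None):
    """Fraction of imitations matched to their true target machine."""
    correct = 0
    for name, voice in imitations.items():
        if offset is not None:
            # Map the imitation toward machine feature space first.
            voice = [x + o for x, o in zip(voice, offset)]
        if nearest_machine(voice, machines) == name:
            correct += 1
    return correct / len(imitations)

# Made-up features: imitations sit at a consistent shift from their targets.
train_machines = {"blender": [8.0, 2.0], "drill": [9.0, 6.0]}
train_imitations = {"blender": [4.0, 1.0], "drill": [5.0, 5.0]}

# Machines not seen during training, to mimic generalization to new machines.
test_machines = {"vacuum": [7.0, 4.0], "printer": [3.0, 3.0]}
test_imitations = {"vacuum": [3.0, 3.0], "printer": [-1.0, 2.0]}

offset = fit_offset(train_imitations, train_machines)
baseline_acc = accuracy(test_imitations, test_machines)
learned_acc = accuracy(test_imitations, test_machines, offset)
```

On this toy data the raw distance baseline confuses the vacuum imitation with the printer, while the offset fitted on the training machines corrects the shift before matching. The sketch only shows the shape of the evaluation; it says nothing about the accuracy reported in the paper.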

Also Published In

Title
2005 Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), October 16-19, 2005, New Paltz, NY
DOI
https://doi.org/10.1109/ASPAA.2005.1540238

More About This Work

Academic Units
Electrical Engineering
Publisher
IEEE
Published Here
June 28, 2012