Academic Commons

Reports

Using Prosodic Features of Speech and Audio Localization in Graphical User Interfaces

Olwal, Alex; Feiner, Steven K.

We describe several approaches for using prosodic features of speech and audio localization to control interactive applications. This information can be used for parameter control, as well as for disambiguating speech recognition. We discuss how characteristics of the spoken sentences can be exploited in the user interface; for example, by considering the speed with which the sentence was spoken and the presence of extraneous utterances. We also show how coarse audio localization can be used for low-fidelity gesture tracking, by inferring the speaker's head position.

More About This Work

Academic Units: Computer Science
Publisher: Department of Computer Science, Columbia University
Series: Columbia University Computer Science Technical Reports, CUCS-016-03
Published Here: April 26, 2011