Presentations (Communicative Events)

Cross-Language Phrase Boundary Detection

Soto Martinez, Victor; Cooper, Erica L.; Rosenberg, Andrew; Hirschberg, Julia Bell

We describe models of prosodic phrasing trained on multiple languages to identify boundaries in an unseen language. Our goal is to build models from High Resource languages, for which hand-annotated prosodic phrase boundaries are available, and use them to identify boundaries in a Low Resource language with little or no training material. We train models on American English, Italian, Mandarin, and German and test on each of these languages. We find that, while pause is the most important feature for phrase boundary prediction in all languages examined, the role of pause in boundary identification varies by annotator, and the relative importance of other features varies significantly by language. We also find that different acoustic correlates of prosodic boundaries characterize different languages. In some, the relative importance of features is silence > pitch > intensity > duration, while in other languages intensity is more important than pitch. These differences do not appear to be attributable to language family, since, e.g., English and German display different patterns.

Files

  • soto-Columbia-icassp13-v2.pdf (application/pdf, 148 KB)

More About This Work

Academic Units
Computer Science
Publisher
Proceedings of ICASSP, 2013
Published Here
August 2, 2013

Notes

Poster for this presentation available at http://hdl.handle.net/10022/AC:P:21200