Stylization of Pitch with Syllable-Based Linear Segments

Ravuri, Suman; Ellis, Daniel P. W.

Fundamental frequency contours for speech, as obtained by common pitch tracking algorithms, contain a great deal of fine detail that is unlikely to hold much perceptual significance for listeners. In our experiments, a radically reduced pitch contour consisting of a single linear segment for each syllable was judged by listeners to be as natural as the original pitch track, based on high-quality analysis-synthesis. We describe the algorithms both for segmenting speech into syllables, based on fitting Gaussians to the energy envelope, and for approximating the pitch contour by independent linear segments for each syllable. We report a web-based test in which 40 listeners compared resyntheses of the stylized pitch contour to equivalent resyntheses based on the original pitch track, and also to pitch tracks stylized by the existing Momel algorithm. Listeners preferred the original pitch contour to the linear approximation in only 60% of cases, where 50% would indicate random guessing. By contrast, the original was preferred over Momel in 74% of cases.
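
As an illustration of the second step only, the following minimal sketch fits one independent line to the pitch values inside each syllable. It is not the authors' implementation: the abstract does not specify the fitting criterion, so ordinary least squares is assumed here, and the names `fit_line` and `stylize_pitch`, the frame-indexed pitch list (with `None` for unvoiced frames), and the `(start, end)` syllable boundaries are all hypothetical. The syllable boundaries are taken as given; the Gaussian-based segmentation is not shown.

```python
from typing import List, Optional, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        # A single point (or identical x values): fall back to a flat segment.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def stylize_pitch(
    f0: Sequence[Optional[float]],
    syllables: Sequence[Tuple[int, int]],
) -> List[Optional[float]]:
    """Replace the pitch in each syllable with one independent linear segment.

    f0        : per-frame pitch in Hz, None where the frame is unvoiced (assumed format)
    syllables : (start, end) frame indices, end exclusive (assumed format)
    """
    out: List[Optional[float]] = [None] * len(f0)
    for start, end in syllables:
        # Only voiced frames contribute to the fit (an assumption of this sketch).
        voiced = [(i, f0[i]) for i in range(start, end) if f0[i] is not None]
        if not voiced:
            continue
        slope, intercept = fit_line([i for i, _ in voiced], [v for _, v in voiced])
        # Write the fitted line back at the voiced frames; unvoiced frames stay None.
        for i, _ in voiced:
            out[i] = slope * i + intercept
    return out


if __name__ == "__main__":
    track = [100.0, 103.0, 104.0, None, 120.0, 118.0, 116.0]
    print(stylize_pitch(track, [(0, 3), (4, 7)]))
```

Because each syllable is fitted on its own, adjacent segments are not forced to meet at syllable boundaries, which matches the "independent linear segments" described above.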


Also Published In

2008 IEEE International Conference on Acoustics, Speech, and Signal Processing: ICASSP '08: Proceedings: March 30-April 4, 2008 Caesars Palace Las Vegas, Nevada, U.S.A.

More About This Work

Academic Units
Electrical Engineering
Published Here
June 27, 2012