2011 Presentations (Communicative Events)
Dialect and Accent Recognition using Phonetic-Segmentation Supervectors
We describe a new approach to automatic dialect and accent recognition which exceeds state-of-the-art performance in three recognition tasks. This approach improves the accuracy and substantially lowers the time complexity of our earlier phonetic-based kernel approach for dialect recognition. In contrast to state-of-the-art acoustic-based systems, our approach employs phone labels and segmentation to constrain the acoustic models. Given a speaker’s utterance, we first obtain phone hypotheses using a phone recognizer and then extract GMM-supervectors for each phone type, effectively summarizing the speaker’s phonetic characteristics in a single vector of phone-type supervectors. Using these vectors, we design a kernel function that computes the phonetic similarities between pairs of utterances, and we use it to train SVM classifiers to identify dialects. Compared with the state of the art, this approach yields a 12.9% relative improvement in EER on Arabic dialects and a 17.9% relative improvement on American vs. Indian English dialects. We also see a 53.5% relative improvement over a GMM-UBM on American Southern vs. Non-Southern English.
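The pipeline in the abstract (phone segmentation, one supervector per phone type, a kernel over pairs of utterances) can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the authors' implementation: each phone type is modeled by a single mean vector instead of a full GMM, the adaptation is a plain relevance-weighted interpolation toward a prior mean, and the kernel is an unnormalized dot product. The names `phone_supervector`, `phonetic_kernel`, `prior_means`, and `relevance` are hypothetical, and the phone labels are assumed to come from an external phone recognizer.

```python
from collections import defaultdict


def phone_supervector(frames, labels, prior_means, relevance=16.0):
    """Build one adapted mean vector per phone type for a single utterance.

    frames      -- list of acoustic feature vectors (lists of floats)
    labels      -- phone label for each frame, assumed to come from a phone recognizer
    prior_means -- dict: phone -> prior mean vector (stand-in for a per-phone UBM)
    relevance   -- interpolation constant (an assumption; not taken from the abstract)
    """
    sums = {p: [0.0] * len(m) for p, m in prior_means.items()}
    counts = defaultdict(int)
    for x, p in zip(frames, labels):
        if p not in sums:
            continue  # skip phones that have no prior model
        counts[p] += 1
        for i, v in enumerate(x):
            sums[p][i] += v

    supervector = {}
    for p, prior in prior_means.items():
        n = counts[p]
        alpha = n / (n + relevance)  # more frames -> trust the utterance more
        mean = [s / n for s in sums[p]] if n else prior
        supervector[p] = [alpha * m + (1.0 - alpha) * u for m, u in zip(mean, prior)]
    return supervector


def phonetic_kernel(sv_a, sv_b):
    """Sum of per-phone dot products between two utterances' supervectors."""
    return sum(a * b for p in sv_a for a, b in zip(sv_a[p], sv_b[p]))


# Toy utterance: two frames labeled "a", no frames for "b".
priors = {"a": [0.0, 0.0], "b": [1.0, 1.0]}
sv = phone_supervector([[2.0, 2.0], [4.0, 4.0]], ["a", "a"], priors, relevance=2.0)
k = phonetic_kernel(sv, sv)
```

Evaluating such a kernel on every pair of training utterances gives a Gram matrix, which an SVM that accepts precomputed kernels (for example scikit-learn's `SVC(kernel="precomputed")`) could then use to train a dialect classifier. A phone type that never occurs in an utterance falls back to its prior mean, so every utterance has a vector of the same shape.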
- 0585anav.pdf (application/pdf, 152 KB)