Academic Commons

Comparison of Word-Based and Syllable-Based Retrieval for Tibetan

Hackett, Paul G.; Oard, Douglas W.

Tibetan retrieval based on automatically segmented words is compared with retrieval based on overlapping syllable n-grams in a known-item retrieval evaluation. The optimal span of fixed-length n-grams is found to be two syllables, and indexing words is found to be as effective as indexing syllable bigrams.
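
As an illustration of what "overlapping syllable n-grams" means, here is a minimal sketch. It assumes syllables are delimited by the Tibetan tsheg mark (U+0F0B) and ignores other punctuation such as the shad. The function names `syllables` and `syllable_ngrams` are hypothetical, and this is not the tokenization used in the paper, which this record does not describe.

```python
TSHEG = "\u0f0b"  # Tibetan intersyllabic mark (tsheg); assumed syllable delimiter


def syllables(text):
    """Split text into syllables on the tsheg, dropping empty pieces."""
    return [s for s in text.split(TSHEG) if s.strip()]


def syllable_ngrams(text, n=2):
    """Return overlapping n-grams of syllables (bigrams by default).

    Text shorter than n syllables is returned as a single term so that
    short inputs still produce something to index (a design choice of
    this sketch, not taken from the paper).
    """
    sylls = syllables(text)
    if not sylls:
        return []
    if len(sylls) < n:
        return [TSHEG.join(sylls)]
    return [TSHEG.join(sylls[i:i + n]) for i in range(len(sylls) - n + 1)]


if __name__ == "__main__":
    # Romanized placeholders stand in for Tibetan syllables.
    sample = TSHEG.join(["ka", "kha", "ga", "nga"])
    for term in syllable_ngrams(sample, n=2):
        print(term)
```

With n=2, each syllable other than the first and last appears in two index terms, so a bigram index can match multi-syllable words without a word segmenter.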

Also Published In

Title: Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages: September 30 to October 2, 2000, Hong Kong
Publisher: Association for Computing Machinery
DOI: https://doi.org/10.1145/355214.355242

More About This Work

Academic Units: American Institute of Buddhist Studies
Published Here: May 27, 2011