The Use of yig-cha and chos-kyi-rnam-grangs in Computing Lexical Cohesion for Tibetan Topic Boundary Detection
To properly implement even a simple Tibetan Information Retrieval (IR) system, segmentation of one form or another (n-gram, POS-tagging, dictionary substring matching, etc.) must be performed (see Hackett (2000b)). To take Tibetan indexing to a more sophisticated level, however, some form of topic detection must be employed. This paper reports the results of a pilot study applying one technique for topic boundary detection, Lexical Cohesion, to Tibetan. It discusses the resources developed and deployed, the theoretical model used, and the model's potential applications.
More About This Work
- Academic Units: American Institute of Buddhist Studies
- Published Here: May 31, 2011
Proceedings of the 12th Seminar of the International Association for Tibetan Studies, Vancouver, BC, August 15-21, 2010 — Information Technology Panel.