The Use of yig-cha and chos-kyi-rnam-grangs in Computing Lexical Cohesion for Tibetan Topic Boundary Detection

Hackett, Paul G.

To properly implement a simple Tibetan Information Retrieval (IR) system, segmentation of one form or another (n-gram, POS-tagging, dictionary substring matching, etc.) must be performed (see Hackett (2000b)). To take Tibetan indexing to a more sophisticated level, however, some form of topic detection must be employed. This paper reports the results of a pilot study on the application to Tibetan of one technique for topic boundary detection: Lexical Cohesion. The resources developed and deployed, the theoretical model used, and its potential applications are discussed.




More About This Work

Academic Units
American Institute of Buddhist Studies
Published Here
May 31, 2011


Proceedings of the 12th Seminar of the International Association for Tibetan Studies, Vancouver, BC, August 15-21, 2010 — Information Technology Panel.