2000 Presentations (Communicative Events)
Evaluation of automatically identified index terms for browsing electronic documents
We present an evaluation of domain-independent natural language tools for use in the identification of significant concepts in documents. Using qualitative evaluation, we compare three shallow processing methods for extracting index terms, i.e., terms that can be used to model the content of documents. We focus on two criteria: quality and coverage. In terms of quality alone, our results show that technical term (TT) extraction [Justeson and Katz 1995] receives the highest rating. However, in terms of a combined quality and coverage metric, the Head Sorting (HS) method, described in [Wacholder 1998], outperforms both other methods, keyword (KW) and TT.
Files
- wacholder_al_00.pdf application/pdf 556 KB
More About This Work
- Academic Units: Computer Science
- Publisher: Proceedings of the Conference NAACL/ANLP2000, Association for Computational Linguistics
- Published Here: May 3, 2013