2006 Presentations (Communicative Events)
Automatic Creation of Domain Templates
Recently, many Natural Language Processing (NLP) applications have improved the quality of their output by using machine learning techniques to mine Information Extraction (IE) patterns that capture information from the input text. Currently, to mine IE patterns one must know in advance the type of information these patterns should capture. In this work we propose a novel methodology for corpus analysis based on cross-examination of several document collections representing different instances of the same domain. We show that this methodology can be used for automatic domain template creation. As the problem of automatic domain template creation is rather new, there is no well-defined procedure for evaluating domain template quality. Thus, we propose a methodology for identifying what information should be present in a template. Using this information, we evaluate the automatically created domain templates through the text snippets retrieved according to those templates.
Files
- filatova_al_06.pdf (application/pdf, 168 KB)
More About This Work
- Academic Units: Computer Science
- Publisher: Proceedings of ACL-COLING
- Published Here: June 30, 2013