2002 Presentations (Communicative Events)
Learning Anchor Verbs for Biological Interaction Patterns from Published Text Articles
Much knowledge modeling in the molecular biology domain involves interactions between proteins, genes, various forms of RNA, small molecules, and other substances. These interactions are typically extracted and codified manually, which increases the cost and time of modeling and substantially limits the coverage of the resulting knowledge base. In this paper, we describe an automatic system that learns interaction verbs from text; these verbs can then form the core of automatically retrieved patterns that model classes of biological interactions. We investigate text features relating verbs to genes and proteins, and apply statistical tests and a logistic regression model to determine whether a given verb belongs to the class of interaction verbs. Our system, AVAD, achieves over 87% precision and 82% recall when tested on an 11 million word corpus of journal articles. In addition, we compare the automatically obtained results with a manually constructed database of interaction verbs and show that the automatic approach can significantly enrich the manual list by detecting rarer interaction verbs that were omitted from the database.
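As a rough illustration of the classification step the abstract describes, the following minimal sketch fits a logistic regression model to per-verb features and reports precision and recall. It is not AVAD's implementation: the two features (the fraction of a verb's occurrences near a gene or protein name, and the fraction with such names on both sides), the verbs, their feature values, and their labels are all hypothetical, chosen only to make the example runnable.

```python
import math

# Hypothetical toy data (NOT from the paper): verb -> ((f1, f2), label)
# f1: assumed fraction of occurrences with a gene/protein name nearby
# f2: assumed fraction of occurrences with such names on both sides
# label: 1 = interaction verb, 0 = other verb
VERB_FEATURES = {
    "bind": ((0.90, 0.70), 1),
    "phosphorylate": ((0.95, 0.80), 1),
    "inhibit": ((0.80, 0.60), 1),
    "activate": ((0.85, 0.65), 1),
    "show": ((0.30, 0.05), 0),
    "suggest": ((0.20, 0.02), 0),
    "use": ((0.25, 0.05), 0),
    "describe": ((0.15, 0.03), 0),
}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Estimated probability that feature vector x belongs to the positive class."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(rows, labels, lr=1.0, epochs=5000):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = score(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


def predict(weights, bias, x, threshold=0.5):
    return score(weights, bias, x) >= threshold


def precision_recall(predicted, actual):
    """Precision and recall of boolean predictions against boolean gold labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


rows = [features for features, _ in VERB_FEATURES.values()]
labels = [label for _, label in VERB_FEATURES.values()]
weights, bias = train_logistic(rows, labels)
predictions = [predict(weights, bias, x) for x in rows]
precision, recall = precision_recall(predictions, [bool(y) for y in labels])
print(f"precision={precision:.2f} recall={recall:.2f}")
```

The sketch evaluates on its own training data purely for brevity; a realistic evaluation, like the one reported in the abstract, would score verbs held out from training.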
Files
- hatzivassiloglou_weng_02.pdf (application/pdf, 109 KB)
More About This Work
- Academic Units: Computer Science
- Publisher: Proceedings of the Workshop on Natural Language Processing in Biomedical Applications
- Published Here: May 10, 2013