Presentations (Communicative Events)

Learning Anchor Verbs for Biological Interaction Patterns from Published Text Articles

Hatzivassiloglou, Vasileios; Weng, Wubin

Much knowledge modeling in the molecular biology domain involves interactions between proteins, genes, various forms of RNA, small molecules, and other substances. These interactions are typically extracted and codified manually, which increases the cost and time of modeling and substantially limits the coverage of the resulting knowledge base. In this paper, we describe an automatic system that learns interaction verbs from text; these verbs can then form the core of automatically retrieved patterns that model classes of biological interactions. We investigate text features relating verbs to genes and proteins, and apply statistical tests and a logistic regression model to determine whether a given verb belongs to the class of interaction verbs. Our system, AVAD, achieves over 87% precision and 82% recall when tested on an 11-million-word corpus of journal articles. In addition, we compare the automatically obtained results with a manually constructed database of interaction verbs and show that the automatic approach can significantly enrich the manual list by detecting rarer interaction verbs that were omitted from the database.
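To make the classification step concrete, here is a minimal sketch of how a logistic regression model can label a verb as an interaction verb from numeric features. It is not the AVAD implementation: the abstract does not list the features or training data, so the two features, the toy values, and every function name below are assumptions chosen purely for illustration.

```python
import math

# Hypothetical per-verb features (NOT the paper's actual feature set):
#   x[0] = fraction of the verb's sentences mentioning two or more gene/protein names
#   x[1] = fraction of the verb's occurrences with a gene/protein name as an argument
# The values are made up for illustration; label 1 = interaction verb, 0 = other verb.
TRAIN = [
    ("bind",     [0.80, 0.90], 1),
    ("activate", [0.70, 0.85], 1),
    ("inhibit",  [0.75, 0.80], 1),
    ("show",     [0.20, 0.15], 0),
    ("describe", [0.10, 0.05], 0),
    ("suggest",  [0.15, 0.20], 0),
]


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Estimated probability that a verb with features x is an interaction verb."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(data, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(data[0][1])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for _, x, y in data:
            error = predict_proba(weights, bias, x) - y
            for i in range(n_features):
                grad_w[i] += error * x[i]
            grad_b += error
        # Average the gradient over the training set before stepping.
        weights = [w - learning_rate * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias


def is_interaction_verb(weights, bias, x, threshold=0.5):
    """Classify a verb by thresholding its estimated probability."""
    return predict_proba(weights, bias, x) >= threshold


if __name__ == "__main__":
    weights, bias = train(TRAIN)
    for verb, x, _ in TRAIN:
        print(verb, round(predict_proba(weights, bias, x), 3))
```

In a real setting the features would be computed from a parsed and entity-tagged corpus, and the decision threshold would be tuned on held-out data to trade precision against recall.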

Files

  • hatzivassiloglou_weng_02.pdf (application/pdf, 109 KB)

More About This Work

Academic Units: Computer Science
Publisher: Proceedings of the Workshop on Natural Language Processing in Biomedical Applications
Published Here: May 10, 2013