Dictionaries for Language Generation Accounting for Co-occurrence Knowledge

Frank A. Smadja

Smadja, Frank A.
Technical reports
Computer Science
Permanent URL:
Columbia University Computer Science Technical Reports
Part Number:
Department of Computer Science, Columbia University
Publisher Location:
New York
Many wording choices in English sentences cannot be accounted for on semantic or syntactic grounds; they can only be expressed in terms of co-occurrence relations. Co-occurrence knowledge has traditionally been overlooked, but it should be included in lexicons because it is an inherent part of the language. In this paper, we demonstrate the importance of co-occurrence knowledge for language generation and show how to include it in computational dictionaries. Including co-occurrence knowledge in the dictionary gives the generator the information it needs to handle many lexical decisions that were previously ignored. We focus here on the process of building the dictionary and show how co-occurrence knowledge can be systematically entered in lexicons. Lexical relations are first identified by a co-occurrence compiler, EXTRACT. Domain-specific semantic information is then used as a criterion for classifying them. We illustrate our approach in the banking domain and explain how it can be used by a natural language generator.
Computer science
Suggested Citation:
Frank A. Smadja, 1989, Dictionaries for Language Generation Accounting for Co-occurrence Knowledge, Columbia University Academic Commons, http://hdl.handle.net/10022/AC:P:12088.
