Dictionaries for Language Generation Accounting for Co-occurrence Knowledge

Frank A. Smadja

Title:
Dictionaries for Language Generation Accounting for Co-occurrence Knowledge
Author(s):
Smadja, Frank A.
Date:
Type:
Technical reports
Department:
Computer Science
Permanent URL:
Series:
Columbia University Computer Science Technical Reports
Part Number:
CUCS-418-89
Publisher:
Department of Computer Science, Columbia University
Publisher Location:
New York
Abstract:
Many wording choices in English sentences cannot be accounted for on semantic or syntactic grounds. They can only be expressed in terms of co-occurrence relations. Co-occurrence knowledge has traditionally been overlooked, but it should be included in lexicons because it is an inherent part of the language. In this paper, we demonstrate the importance of co-occurrence knowledge for language generation and we show how to include it in computational dictionaries. Using co-occurrence knowledge in the dictionary provides the generator with the information necessary for handling many lexical decisions that were previously ignored. We focus here on the process of building the dictionary, and we show how co-occurrence knowledge can be systematically entered in lexicons. Lexical relations are first identified by a co-occurrence compiler, EXTRACT. Then, domain-specific semantic information is used as a criterion for classifying them. We exemplify our approach in the banking domain and we explain how it can be used by a natural language generator.
Subject(s):
Computer science