1999 Presentations (Communicative Events)
Tagging French Without Lexical Probabilities - Combining Linguistic Knowledge And Statistical Learning
This paper explores morpho-syntactic ambiguities in French in order to develop a strategy for part-of-speech disambiguation that (a) reflects the complexity of French as an inflected language, (b) optimizes the estimation of probabilities, and (c) gives the user flexibility in choosing a tagset. The problem with extracting lexical probabilities from a limited training corpus is that the resulting statistical model may not represent the use of a particular word in a particular context. In a highly inflected language this problem is particularly serious, since a single word form can be tagged with a large number of parts of speech.
Files
- tzoukermann_al_99.pdf (application/pdf, 273 KB)
More About This Work
- Academic Units: Computer Science
- Publisher: Natural Language Processing Using Very Large Corpora
- Published Here: May 3, 2013