CATiB: The Columbia Arabic Treebank

Habash, Nizar Y.; Roth, Ryan M.

The Columbia Arabic Treebank (CATiB) is a resource for Arabic parsing. CATiB contrasts with previous efforts on Arabic treebanking and treebanking of morphologically rich languages in that it encodes less linguistic information in the interest of speedier annotation of large amounts of text. This paper describes CATiB's representation and annotation procedure, and reports on achieved inter-annotator agreement and annotation speed.

More About This Work

Academic Units: Center for Computational Learning Systems
Publisher: Center for Computational Learning Systems, Columbia University
Series: CCLS Technical Report, CCLS-09-01
Published Here: November 22, 2010