Technical reports:
A Modern Standard Arabic Closed-Class Word List
Wael Sameer Salloum; Nizar Y. Habash
- Title:
- A Modern Standard Arabic Closed-Class Word List
- Author(s):
- Salloum, Wael Sameer; Habash, Nizar Y.
- Date:
- 2012
- Type:
- Technical reports
- Department:
- Center for Computational Learning Systems
- Permanent URL:
- http://hdl.handle.net/10022/AC:P:14023
- Series:
- CCLS Technical Report
- Part Number:
- CCLS-12-03
- Publisher:
- Center for Computational Learning Systems, Columbia University
- Publisher Location:
- New York
- Abstract:
- This document describes a list of Modern Standard Arabic closed-class words, which can be used as a stop list for a variety of natural language processing applications. The list contains 740 inflected words and clitics in the Arabic Treebank (ATB) tokenization scheme (Maamouri et al., 2004; Habash, 2010). The inflected words are based on 309 lemmas from the Standard Arabic Morphological Analyzer, SAMA (Graff et al., 2009). To get a copy of the full list, please contact the authors.
- Subject(s):
- Artificial intelligence; Computer science
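
The abstract mentions using the closed-class list as a stop list. A minimal sketch of that use is given here, assuming the list is distributed as plain text with one entry per line (the report does not state the file format) and that the input has already been tokenized in the ATB scheme. The function names and the sample entries (`w+`, `fy`, `mn`, in Buckwalter transliteration) are illustrative placeholders and are not taken from the report's actual list.

```python
from typing import Iterable, List, Set


def load_stop_list(lines: Iterable[str]) -> Set[str]:
    """Build a stop set from one-entry-per-line text.

    Assumed format (hypothetical): blank lines and lines starting
    with '#' are ignored; surrounding whitespace is stripped.
    """
    stop: Set[str] = set()
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            stop.add(entry)
    return stop


def remove_stop_words(tokens: Iterable[str], stop: Set[str]) -> List[str]:
    """Return the tokens that are not in the stop set, in original order."""
    return [tok for tok in tokens if tok not in stop]


# Placeholder entries for illustration only, not the report's list.
sample_lines = ["# illustrative entries", "w+", "fy", "mn", ""]
stop_set = load_stop_list(sample_lines)

# A toy ATB-style token sequence in Buckwalter transliteration.
tokens = ["w+", "ktb", "fy", "Albyt"]
content_tokens = remove_stop_words(tokens, stop_set)
print(content_tokens)
```

Holding the entries in a set keeps each membership check constant-time on average, which matters little for a list of 740 entries but keeps the filter cheap on large corpora. With the real list, `sample_lines` would be replaced by the lines of the file obtained from the authors.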