A Modern Standard Arabic Closed-Class Word List

Wael Sameer Salloum; Nizar Y. Habash

Title:
A Modern Standard Arabic Closed-Class Word List
Author(s):
Salloum, Wael Sameer
Habash, Nizar Y.
Date:
Type:
Technical reports
Department:
Center for Computational Learning Systems
Permanent URL:
Series:
CCLS Technical Report
Part Number:
CCLS-12-03
Publisher:
Center for Computational Learning Systems, Columbia University
Publisher Location:
New York
Abstract:
This document describes a list of Modern Standard Arabic closed-class words that can be used as a stop list in a variety of natural language processing applications. The list contains 740 inflected words and clitics, tokenized according to the Arabic Treebank (ATB) scheme (Maamouri et al., 2004; Habash, 2010). The inflected words are derived from 309 lemmas in the Standard Arabic Morphological Analyzer, SAMA (Graff et al., 2009). To obtain a copy of the full list, please contact the authors.
Subject(s):
Artificial intelligence
Computer science