1993 Reports
crep: a regular expression-matching textual corpus tool
crep is a UNIX tool that searches a tagged or free-text corpus file and outputs each sentence matching a regular expression supplied by the user as a parameter. The expression is built from user-defined regular expressions over words and/or part-of-speech tags. crep aims to make such searches faster and easier than either (a) searching through corpora by hand or (b) constructing a lexical scanner for each specific search. It does so by offering a simple expression syntax from which it automatically constructs an appropriate scanner. The user can therefore run a whole search in a single command, which implicitly or explicitly invokes several tools, including a sentence delimiter, a part-of-speech tagger (developed by Ken Church at AT&T Bell Laboratories), and various output filters.
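To make the idea of matching sentences against an expression over words and part-of-speech tags concrete, here is a minimal Python sketch. It is not crep and does not use crep's expression syntax, which this abstract does not specify. The `word/TAG` corpus format, the mini-syntax, the function names `compile_pattern` and `search`, and the sample sentences are all assumptions made for illustration.

```python
import re

ANY = r"[^\s/]+"  # any run of characters that is neither whitespace nor a slash


def compile_pattern(expr: str) -> re.Pattern:
    """Translate a whitespace-separated token expression into one regex.

    Hypothetical mini-syntax (an assumption, not crep's real syntax):
      word       matches that word with any tag
      /TAG       matches any word with that tag
      word/TAG   matches that word with that tag
    Each piece is pasted in as a regex fragment, e.g. '[Dd]ogs?'.
    """
    parts = []
    for tok in expr.split():
        word, _, tag = tok.partition("/")
        parts.append(rf"(?:{word or ANY})/(?:{tag or ANY})")
    # Consecutive tokens must be adjacent and aligned on token boundaries.
    return re.compile(r"(?<!\S)" + r"\s+".join(parts) + r"(?!\S)")


def search(sentences, expr):
    """Return every sentence containing a match for the token expression."""
    pattern = compile_pattern(expr)
    return [s for s in sentences if pattern.search(s)]


# Toy tagged corpus, one sentence per string (invented sample data).
corpus = [
    "The/DT dog/NN barks/VBZ ./.",
    "Dogs/NNS bark/VBP loudly/RB ./.",
    "A/DT cat/NN sleeps/VBZ ./.",
]

for sentence in search(corpus, "/DT /NN /VBZ"):
    print(sentence)
```

The sketch assumes the corpus has already been split into sentences and tagged; in the workflow the abstract describes, those steps are handled by the sentence delimiter and the tagger. Building one compiled pattern per query mirrors, in a very loose way, the idea of generating a scanner from a simple expression rather than writing one by hand.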
Files
- cucs-005-93.pdf (application/pdf, 168 KB)
More About This Work
- Academic Units: Computer Science
- Publisher: Department of Computer Science, Columbia University
- Series: Columbia University Computer Science Technical Reports, CUCS-005-93
- Published Here: January 20, 2012