2003 Reports
Newsblaster Russian-English Clustering Performance Analysis
The Natural Language Group is developing a multi-language version of Columbia Newsblaster, a program that generates summaries of news articles collected from web sites. Newsblaster currently processes articles in Arabic, Japanese, Portuguese, Spanish, and Russian, as well as English. This report outlines the Russian-language processing software, focusing on machine translation and document clustering. An analysis of the Russian-English clustering results indicates encouraging inter-language and intra-language performance.
More About This Work
- Academic Units
- Computer Science
- Series
- Columbia University Computer Science Technical Reports, 41
- Published Here
- August 26, 2009