2006 Presentations (Communicative Events)
Email Thread Reassembly Using Similarity Matching
Email thread reassembly is the task of linking messages by parent-child relationships. In this paper, we present two approaches to address this problem. One exploits previously undocumented header information from the Microsoft Exchange Protocol. The other uses string similarity metrics and a heuristic algorithm to reassemble threads in the absence of header information. The pros and cons of both methods are discussed. The similarity matching method is evaluated using the Enron email corpus and found to perform well.
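The header-free approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Message` fields, the `>` quoting convention, the subject-prefix list, the 0.5 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity metric are all assumptions chosen for illustration. Each message is linked to the earlier message whose own text best matches the text the later message quotes.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Dict, List, Optional


@dataclass
class Message:
    # Hypothetical record layout; a real corpus will differ.
    msg_id: str
    sent: int      # any sortable send time
    subject: str
    body: str


def normalize_subject(subject: str) -> str:
    """Lowercase the subject and strip reply/forward prefixes (assumed list)."""
    s = subject.strip().lower()
    changed = True
    while changed:
        changed = False
        for prefix in ("re:", "fw:", "fwd:"):
            if s.startswith(prefix):
                s = s[len(prefix):].strip()
                changed = True
    return s


def quoted_text(body: str) -> str:
    """Text the message quotes, assuming '>'-prefixed quoting."""
    lines = [ln.lstrip("> ").strip() for ln in body.splitlines() if ln.startswith(">")]
    return "\n".join(lines)


def own_text(body: str) -> str:
    """Text the sender wrote, i.e. every line that is not quoted."""
    lines = [ln.strip() for ln in body.splitlines() if not ln.startswith(">")]
    return "\n".join(lines).strip()


def reassemble(messages: List[Message], threshold: float = 0.5) -> Dict[str, Optional[str]]:
    """Map each message id to its most likely parent id, or None for thread roots."""
    ordered = sorted(messages, key=lambda m: m.sent)
    parents: Dict[str, Optional[str]] = {}
    for i, child in enumerate(ordered):
        quote = quoted_text(child.body)
        best_id, best_score = None, threshold
        for cand in ordered[:i]:  # only earlier messages can be parents
            if normalize_subject(cand.subject) != normalize_subject(child.subject):
                continue
            if not quote:
                continue  # nothing to compare against in this sketch
            score = SequenceMatcher(None, quote, own_text(cand.body)).ratio()
            if score >= best_score:
                best_id, best_score = cand.msg_id, score
        parents[child.msg_id] = best_id
    return parents
```

Calling `reassemble` on a list of `Message` objects returns a dictionary of parent links from which a thread tree can be built. A heuristic this simple leaves any message that quotes nothing without a parent, so a practical system would need further signals.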
Files
- yeh_harnly_06.pdf (application/pdf, 515 KB)
More About This Work
- Academic Units: Computer Science
- Publisher: Conference on Email and Anti-Spam
- Published Here: July 5, 2013