Presentations (Communicative Events)

Email Thread Reassembly Using Similarity Matching

Yeh, Jen-Yuan; Harnly, Aaron

Email thread reassembly is the task of linking messages by parent-child relationships. In this paper, we present two approaches to this problem. The first exploits previously undocumented header information from the Microsoft Exchange Protocol. The second uses string similarity metrics and a heuristic algorithm to reassemble threads when header information is absent. We discuss the strengths and weaknesses of both methods. The similarity matching method is evaluated on the Enron email corpus and found to perform well.
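
To make the idea of header-free reassembly concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that a reply shares its parent's subject (after stripping prefixes such as "RE:") and tends to quote the parent's body, so the earlier message whose body is most similar to the reply is taken as the parent. The `Message` type, the `guess_parents` function, the 0.5 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity metric are all illustrative assumptions.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Dict, List, Optional


@dataclass
class Message:
    # Hypothetical minimal message record; real corpora carry more fields.
    msg_id: str
    sent: int      # send time, e.g. a Unix timestamp
    subject: str
    body: str


def normalize_subject(subject: str) -> str:
    """Strip leading reply/forward prefixes and lowercase the rest."""
    stripped = re.sub(r"^(\s*(re|fw|fwd)\s*:\s*)+", "", subject, flags=re.IGNORECASE)
    return stripped.strip().lower()


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; a stand-in for the paper's metrics."""
    return SequenceMatcher(None, a, b).ratio()


def guess_parents(messages: List[Message], threshold: float = 0.5) -> Dict[str, Optional[str]]:
    """Map each message id to the id of its guessed parent, or None."""
    ordered = sorted(messages, key=lambda m: m.sent)
    parents: Dict[str, Optional[str]] = {}
    for i, child in enumerate(ordered):
        best_id: Optional[str] = None
        best_score = threshold
        child_subject = normalize_subject(child.subject)
        # Heuristic: only earlier messages with the same normalized
        # subject are candidate parents.
        for cand in ordered[:i]:
            if normalize_subject(cand.subject) != child_subject:
                continue
            score = similarity(cand.body, child.body)
            if score >= best_score:
                best_id, best_score = cand.msg_id, score
        parents[child.msg_id] = best_id
    return parents


if __name__ == "__main__":
    msgs = [
        Message("a", 1, "Budget meeting", "Can we meet on Friday to discuss the budget?"),
        Message("b", 2, "RE: Budget meeting",
                "Friday works for me.\n> Can we meet on Friday to discuss the budget?"),
        Message("c", 3, "Lunch", "Anyone want lunch?"),
    ]
    print(guess_parents(msgs))
```

A pairwise comparison like this is quadratic in the number of messages, so a practical system would likely restrict candidates further, for example by a time window or by participants; those refinements are outside this sketch.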

More About This Work

Academic Units: Computer Science
Publisher: Conference on Email and Anti-Spam
Published Here: July 5, 2013