Home

Markov Models for Network-Behavior Modeling and Anonymization

Yingbo Song; Salvatore Stolfo; Tony Jebara

Title:
Markov Models for Network-Behavior Modeling and Anonymization
Author(s):
Song, Yingbo
Stolfo, Salvatore
Jebara, Tony
Date:
Type:
Technical reports
Department:
Computer Science
Permanent URL:
Series:
Columbia University Computer Science Technical Reports
Part Number:
CUCS-029-11
Publisher:
Department of Computer Science, Columbia University
Publisher Location:
New York
Abstract:
Modern network security research has demonstrated a clear need for open sharing of traffic datasets between organizations, a need that has so far been superseded by the challenge of removing sensitive content beforehand. Network Data Anonymization (NDA) is emerging as a field dedicated to this problem, with its main direction focusing on removal of identifiable artifacts that might pierce privacy, such as usernames and IP addresses. However, recent research has demonstrated that more subtle statistical artifacts, also present, may yield fingerprints that are just as differentiable as the former. This result highlights certain shortcomings in current anonymization frameworks -- particularly, ignoring the behavioral idiosyncrasies of network protocols, applications, and users. Recent anonymization results have shown that the extent to which utility and privacy can be obtained is mainly a function of the information in the data that one is aware and not aware of. This paper leverages the predictability of network behavior in our favor to augment existing frameworks through a new machine-learning-driven anonymization technique. Our approach uses the substitution of individual identities with group identities where members are divided based on behavioral similarities, essentially providing anonymity-by-crowds in a statistical mix-net. We derive time-series models for network traffic behavior which quantifiably models the discriminative features of network "behavior" and introduce a kernel-based framework for anonymity which fits together naturally with network-data modeling.
Subject(s):
Computer science
Item views:
294
Metadata:
text | xml

In Partnership with the Center for Digital Research and Scholarship at Columbia University Libraries/Information Services | Terms of Use