
Using Machine Learning to improve Internet Privacy

Sebastian Zimmeck

Title:
Using Machine Learning to improve Internet Privacy
Author(s):
Zimmeck, Sebastian
Thesis Advisor(s):
Bellovin, Steven Michael
Date:
Type:
Dissertations
Department(s):
Computer Science
Persistent URL:
Notes:
Ph.D., Columbia University.
Abstract:
Internet privacy lacks transparency, choice, quantifiability, and accountability, especially as the deployment of machine learning technologies becomes mainstream. However, these technologies can be privacy-protective as well as privacy-invasive. This dissertation advances the thesis that machine learning can be used to improve Internet privacy. Starting with a case study that shows how to estimate the potential of a social network to learn the ethnicity and gender of its users from geotags, it explores various strands of machine learning technologies that further privacy. While the quantification of privacy is the subject of well-known privacy metrics, such as k-anonymity and differential privacy, I discuss how some of those metrics can be leveraged in tandem with machine learning algorithms to quantify the privacy-invasiveness of data collection practices. Further, I demonstrate how the current notice-and-choice paradigm can be realized through automated, machine learning-based analysis of privacy policies. The implemented system notifies users efficiently and accurately of applicable data practices. In addition, analyzing software data flows enables users to compare actual data practices to described ones and enables regulators to enforce the described practices at scale. Machine learning technologies can likewise supplement the emerging cross-device tracking practices of ad networks, analytics companies, and others, notifying users of privacy practices across devices and giving them the choice they are entitled to by law. Ultimately, cross-device tracking is a harbinger of the emerging Internet of Things, for which I envision intelligent personal assistants that help users navigate the increasing complexity of privacy notices and choices.
Subject(s):
Computer science
Law
Internet--Security measures
Computer security
Machine learning
Internet--Law and legislation
Computer security--Law and legislation
Suggested Citation:
Sebastian Zimmeck, , Using Machine Learning to improve Internet Privacy, Columbia University Academic Commons, .

Center for Digital Research and Scholarship at Columbia University Libraries