Spectrogram: A Mixture-of-Markov-Chains Model for Anomaly Detection in Web Traffic

Song, Yingbo; Keromytis, Angelos D.; Stolfo, Salvatore

We present Spectrogram, a mixture of Markov-chains sensor for anomaly detection (AD) against web-layer (port 80) code-injection attacks such as PHP file inclusion, SQL-injection, cross-site-scripting, as well as memory layer buffer overflows. Port 80 is the gateway to many application level services and a large array of attacks are channeled through this vector, servers cannot easily firewall this port. Signature-based sensors are effective in filtering known exploits but cannot detect 0-day vulnerabilities or deal with polymorphism and statistical AD approaches have mostly been limited to network layer, protocol-agnostic modeling, weakening their effectiveness. N -gram based modeling approaches have recently demonstrated success but the ill-posed nature of modeling large grams have thus far prevented exploration of higher order statistical models. In this paper, we provide a solution to this problem based on a factorization into Markov-chains and aim to model higher order structure as well as content for web requests. Spectrogram is implemented in a protocol-aware, passive, network-situated, but CGI-layered, AD architecture and we show in our evaluation that this model demonstrates significant detection results on an array of real world web-layer attacks, achieving at least 97% detection rates on all but one dataset and comparing favorably against other AD sensors.



More About This Work

Academic Units
Computer Science
Department of Computer Science, Columbia University
Columbia University Computer Science Technical Reports, CUCS-040-08
Published Here
April 26, 2011