Academic Commons


On the Infeasibility of Modeling Polymorphic Shellcode for Signature Detection

Song, Yingbo; Locasto, Michael E.; Stavrou, Angelos; Keromytis, Angelos D.; Stolfo, Salvatore

Polymorphic malcode remains one of the most troubling threats for information security and intrusion defense systems. The ability of malcode to be automatically transformed into a semantically equivalent variant frustrates attempts to construct a single, simple, easily verifiable representation. We present a quantitative analysis of the strengths and limitations of shellcode polymorphism and consider the impact of this analysis on current practices in intrusion detection. Our examination focuses on the nature of shellcode 'decoding routines', and the empirical evidence we gather illustrates our main result: that the challenge of modeling the class of self-modifying code is likely intractable, even when the size of the instruction sequence (i.e., the decoder) is relatively small. We develop metrics to gauge the power of polymorphic engines and use them to provide insight into the strengths and weaknesses of some popular engines. We believe this analysis supplies a novel and useful way to understand the limitations of the current generation of signature-based techniques. We analyze some contemporary polymorphic techniques, explore ways to improve them in order to forecast the nature of future threats, and present our suggestions for countermeasures. Our results indicate that the class of polymorphic behavior is too widely spread and varied to model effectively. We conclude that modeling normal content is ultimately a more promising defense mechanism than modeling malicious or abnormal content.



More About This Work

Academic Units
Computer Science
Department of Computer Science, Columbia University
Columbia University Computer Science Technical Reports, CUCS-007-07
Published Here
April 28, 2011