Dynamic Adaptation of Rules for Temporal Event Correlation in Distributed Systems

Griffith, Rean; Hellerstein, Joseph L.; Diao, Yixin; Kaiser, Gail E.

Event correlation is essential to realizing self-managing distributed systems. For example, distributed systems often require that events be correlated from multiple systems using temporal patterns to detect denial of service attacks and to warn of problems with business critical applications that run on multiple servers. This paper addresses how to specify timer values for temporal patterns so as to manage the trade-off between false alarms and undetected alarms. A central concern is addressing the variability of event propagation delays due to factors such as contention for network and server resources. To this end, we develop an architecture and an adaptive control algorithm that dynamically compensate for variations in propagation delays. Our approach makes Management Stations more autonomic by avoiding the need for manual adjustments of timer values in temporal rules. Further, studies we conducted of a testbed system suggest that our approach produces results that are at least as good as an optimal fixed setting of timer values.



More About This Work

Academic Units
Computer Science
Department of Computer Science, Columbia University
Columbia University Computer Science Technical Reports, CUCS-003-05
Published Here
April 26, 2011