An Event System Architecture for Scaling Scale-Resistant Services

Gross, Philip N.

Large organizations are deploying ever-increasing numbers of networked compute devices, from utilities installing smart controllers on electricity distribution cables, to the military giving PDAs to soldiers, to corporations putting PCs on the desks of employees. These computers are often far more capable than is needed to accomplish their primary task, whether it be guarding a circuit breaker, displaying a map, or running a word processor. These devices would be far more useful if they had some awareness of the world around them: a controller that resists tripping a switch, knowing that it would set off a cascade failure, a PDA that warns its owner of imminent danger, a PC that exchanges reports of suspicious network activity to its peers to identify stealthy computer crackers. In order to provide these higher-level services, the devices need a model of their environment. The controller needs a model of the distribution grid, the PDA needs a model of the battlespace, and the PC needs a model of the network and of normal network and user behavior. Unfortunately, not only might models such as these require substantial computational resources, but generating and updating them is even more demanding. Modelbuilding algorithms tend to be bad in three ways: requiring large amounts of CPU and memory to run, needing large amounts of data from the outside to stay up to date, and running so slowly that can't keep up with any fast changes in the environment that might occur. We can solve these problems by reducing the scope of the model to the immediate locale of the device, since reducing the size of the model makes the problem of model generation much more tractable. But such models are also much less useful, having no knowledge of the wider system. This thesis proposes a better solution to this problem called Level of Detail, after the computer graphics technique of the same name. Instead of simplifying the representation of distant objects, however, we simplify less-important data. Compute devices in the system receive streams of data that is a mixture of detailed data from devices that directly affect them and data summaries (aggregated data) from less directly influential devices. The degree to which the data is aggregated (i.e., how much it is reduced) is determined by calculating an influence metric between the target device and the remote device. The smart controller thus receives a continuous stream of raw data from the adjacent transformer, but only an occasional small status report summarizing all the equipment in a neighborhood in another part of the city. This thesis describes the data distribution system, the aggregation functions, and the influence metrics that can be used to implement such a system. I also describe my current towards establishing a test environment and validating the concepts, and describe the next steps in the research plan.



More About This Work

Academic Units
Computer Science
Department of Computer Science, Columbia University
Columbia University Computer Science Technical Reports, CUCS-058-05
Published Here
April 27, 2011