DYSWIS: Collaborative Network Fault Diagnosis - Of End-users, By End-users, For End-users

Kim, Kyung Hwa; Singh, Vishal; Schulzrinne, Henning G.

As applications grow more complex, end-users increasingly need help diagnosing network faults. However, existing failure diagnosis techniques fail to assist end-users in accessing applications and services. We present DYSWIS, an automatic network fault detection and diagnosis system for end-users. The key idea is collaboration among end-users: a node asks multiple other nodes to diagnose a network fault in real time, collects diverse information from different parts of the network, and infers the cause of the failure. DYSWIS uses a DHT network to find collaborating nodes with the network properties required to diagnose a failure. The framework allows rules and probes to be updated dynamically in a running system. Another key aspect is the contribution of expert knowledge (rules and probes) by application developers, vendors, and network administrators, which enables crowdsourcing of diagnosis strategies for a growing set of applications. We have implemented the framework and the software and tested them on our test bed and on PlanetLab. The results show that several complex, commonly occurring failures can be detected and diagnosed successfully using DYSWIS, while a single-user probe with traditional tools fails to pinpoint the cause of such failures. We validate that our base modules and rules are sufficient to detect the infrastructural failures that cause the majority of application failures.



More About This Work

Academic Units
Computer Science
Department of Computer Science, Columbia University
Columbia University Computer Science Technical Reports, CUCS-017-11
Published Here
July 8, 2011