1988 Reports
A Survey of Software Fault Tolerance Techniques
This report examines the state of the field of software fault tolerance. It discusses terminology, techniques for building reliable systems, and fault tolerance. Although no scientific consensus on how to measure software reliability has been reached, software is now sufficiently pervasive that the software components of larger systems must be reliable, since those systems depend on them. Fault-tolerant systems use redundant components to mitigate the effects of component failures, and thus create a system that is more reliable than a single component. This idea can be applied to software systems as well. Several techniques for designing fault-tolerant software systems are discussed and assessed qualitatively, where "software fault" refers to what is more commonly known as a bug. For each technique, the assumptions, relative merits, available experimental results, and implementation experience are discussed. This leads to some conclusions about the state of the field.
Files
- CUCS-325-88.pdf (application/pdf, 872 KB)
More About This Work
- Academic Units: Computer Science
- Publisher: Department of Computer Science, Columbia University
- Series: Columbia University Computer Science Technical Reports, CUCS-325-88
- Published Here: December 7, 2011