Academic Commons

Reports

Improving the Quality of Computational Science Software by Using Metamorphic Relations to Test Machine Learning Applications

Xie, Xiaoyuan; Ho, Joshua; Murphy, Christian; Kaiser, Gail E.; Xu, Baowen; Chen, Tsong Yueh

Many scientific computing applications, in fields such as computational biology and computational linguistics, depend on machine learning algorithms for core functionality in their problem domains. However, such applications are difficult to test because there is often no 'test oracle' to indicate what the correct output should be for an arbitrary input. To help improve the quality of scientific computing software, in this paper we present a technique for testing the implementations of the machine learning classification algorithms on which such software depends. Our technique is based on 'metamorphic testing', an approach that has been shown to be effective when no test oracle is available. In addition to presenting our technique, we describe a case study we performed on a real-world machine learning application framework and discuss how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also discuss how our findings can be of use to other areas of computational science and engineering.
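As a rough illustration of the idea behind metamorphic testing, the sketch below checks a classifier without knowing the correct output for any input: it transforms the input in a way that should leave the prediction unchanged and compares the two results. Everything in it is an assumption made for illustration and is not taken from the report: the toy k-nearest-neighbours classifier `knn_predict`, the helper `check_relations`, and the two relations chosen (reordering the training samples, and permuting the attribute columns consistently in the training data and the query). Integer features are used so that distance sums are exact and rounding cannot cause spurious violations.

```python
import random
from collections import Counter


def knn_predict(train, labels, query, k=3):
    """Toy k-NN classifier (hypothetical, for illustration only)."""
    # Rank training points by (squared distance, label). Sorting on the
    # whole tuple breaks distance ties in a way that does not depend on
    # the order in which the training samples were supplied.
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train, labels)
    )
    votes = Counter(y for _, y in ranked[:k])
    top = max(votes.values())
    # Break vote ties by choosing the smallest label.
    return min(y for y, count in votes.items() if count == top)


def check_relations(trials=200, seed=0):
    """Return a list of metamorphic-relation violations found."""
    rng = random.Random(seed)
    violations = []
    n, d = 12, 4  # training-set size and number of attributes
    for _ in range(trials):
        train = [[rng.randint(-9, 9) for _ in range(d)] for _ in range(n)]
        labels = [rng.choice("ab") for _ in range(n)]
        query = [rng.randint(-9, 9) for _ in range(d)]
        base = knn_predict(train, labels, query)

        # Relation 1: reordering the training samples (with their labels)
        # should not change the predicted class.
        order = list(range(n))
        rng.shuffle(order)
        shuffled = knn_predict(
            [train[i] for i in order], [labels[i] for i in order], query
        )
        if shuffled != base:
            violations.append(("sample order", train, labels, query))

        # Relation 2: applying the same column permutation to the training
        # data and to the query should not change the predicted class.
        cols = list(range(d))
        rng.shuffle(cols)
        permuted = knn_predict(
            [[x[j] for j in cols] for x in train],
            labels,
            [query[j] for j in cols],
        )
        if permuted != base:
            violations.append(("attribute order", train, labels, query))
    return violations


if __name__ == "__main__":
    found = check_relations()
    print("violations found:", len(found))
```

An empty result does not prove the implementation correct; it only means that none of the checked relations was violated on the generated inputs. A violation, on the other hand, points to a defect (or to a relation that does not actually hold for the algorithm) without requiring an oracle for the expected class.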

More About This Work

Academic Units: Computer Science
Publisher: Department of Computer Science, Columbia University
Series: Columbia University Computer Science Technical Reports, CUCS-004-09
Published Here: April 26, 2011