Academic Commons

Pathosphere.org: pathogen detection and characterization through a web-based, open source informatics platform

Kilianski, Andy; Yao, Shijie; Carcel, Patrick; Roth, Pierce; Schulte, Josh; Donarum, Greg B.; Fochler, Ed T.; Hill, Jessica M.; Liem, Alvin T.; Wiley, Michael R.; Ladner, Jason T.; Pfeffer, Bradley P.; Elliot, Oliver T.; Petrosov, Alexandra; Jima, Dereje D.; Vallard, Tyghe G.; Melendrez, Melanie C.; Skowronski, Evan; Quan, Phenix-Lan; Lipkin, Ian W.; Gibbons, Henry S.; Hirschberg, David L.; Palacios, Gustavo F.; Rosenzweig, C. Nicole

Background
The detection of pathogens in complex sample backgrounds has been revolutionized by wide access to next-generation sequencing (NGS) platforms. However, analytical methods to support NGS platforms are not as uniformly available. Pathosphere (found at Pathosphere.org) is a cloud-based, open-source community tool that allows for communication, collaboration and sharing of NGS analytical tools and data amongst scientists working in academia, industry and government. The architecture allows users to upload data and run available bioinformatics pipelines without the need for onsite processing hardware or technical support.

Results
The pathogen detection capabilities hosted on Pathosphere were tested by analyzing NGS data from pathogen-containing samples, including spiked human samples as well as samples with human and zoonotic host backgrounds. Pathosphere analytical pipelines developed by Edgewood Chemical Biological Center (ECBC) identified spiked pathogens within a common sample analyzed by 454, Ion Torrent, and Illumina sequencing platforms. ECBC pipelines also correctly identified pathogens in human samples containing arenavirus, and in animal samples containing flavivirus and coronavirus. These analytical methods were less effective at detecting sequences with limited homology to previous annotations within NCBI databases, such as parvovirus. Utilizing the pipeline-hosting adaptability of Pathosphere, the analytical suite was supplemented by analytical pipelines designed by the United States Army Medical Research Institute of Infectious Diseases and Walter Reed Army Institute of Research (USAMRIID-WRAIR). These pipelines were implemented and detected parvovirus sequence in the sample that the ECBC iterative analysis had previously failed to identify.

Conclusions
By accurately detecting pathogens in a variety of samples, this work demonstrates the utility of Pathosphere and provides a platform for utilizing, modifying and creating pipelines for a variety of NGS technologies developed to detect pathogens in complex sample backgrounds. These results showcase the existing pipelines and web-based interface of Pathosphere, as well as the plug-in adaptability that allows newer NGS analytical software to be integrated as it becomes available.

Files

  • 12859_2015_Article_840.pdf (814 KB)

Also Published In

Title
BMC Bioinformatics
DOI
https://doi.org/10.1186/s12859-015-0840-5

More About This Work

Academic Units
Biomedical Informatics
Epidemiology
Published Here
February 6, 2017