Technical reports:
Combining Pairwise Sequence Similarity and Support Vector Machines for Remote Protein Homology Detection
Li Liao; William Stafford Noble
- Title:
- Combining Pairwise Sequence Similarity and Support Vector Machines for Remote Protein Homology Detection
- Author(s):
- Liao, Li; Noble, William Stafford
- Date:
- 2001
- Type:
- Technical reports
- Department:
- Computer Science
- Permanent URL:
- http://hdl.handle.net/10022/AC:P:29266
- Series:
- Columbia University Computer Science Technical Reports
- Part Number:
- CUCS-012-01
- Publisher:
- Department of Computer Science, Columbia University
- Publisher Location:
- New York
- Abstract:
- One key element in understanding the molecular machinery of the cell is to understand the meaning, or function, of each protein encoded in the genome. A very successful means of inferring the function of a previously unannotated protein is via sequence similarity with one or more proteins whose functions are already known. Currently, one of the most powerful such homology detection methods is the SVM-Fisher method of Jaakkola, Diekhans and Haussler (ISMB 2000). This method combines a generative, profile hidden Markov model (HMM) with a discriminative classification algorithm known as a support vector machine (SVM). The current work presents an alternative method for SVM-based protein classification. The method, SVM-pairwise, uses a pairwise sequence similarity algorithm such as Smith-Waterman in place of the HMM in the SVM-Fisher method. The resulting algorithm, when tested on its ability to recognize previously unseen families from the SCOP database, yields significantly better remote protein homology detection than SVM-Fisher, profile HMMs and PSI-BLAST.
- Subject(s):
- Computer science
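
The abstract's central idea, describing each protein by its pairwise similarity scores against a set of known sequences, can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names `smith_waterman_score` and `pairwise_features`, the match/mismatch/gap values, the linear gap penalty, and the toy sequences are all assumptions made for the example. A real system would use a substitution matrix and affine gaps, and would pass the resulting vectors to an SVM trainer, which is omitted here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Uses the Smith-Waterman recurrence with a simple linear gap penalty.
    The scoring values are illustrative assumptions, not those of the report.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for ca in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            score = max(0, diag, up, left)  # 0 restarts the local alignment
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def pairwise_features(seq, reference_seqs):
    """Represent seq as a vector of similarity scores against each reference.

    In an SVM-pairwise style setup, the reference set would be the labeled
    training sequences, and these vectors would be the SVM's inputs.
    """
    return [smith_waterman_score(seq, ref) for ref in reference_seqs]


# Toy usage with made-up sequences (hypothetical data).
references = ["ACGT", "TTTT"]
vector = pairwise_features("ACGT", references)
```

Each sequence, whether training or test, is mapped to a fixed-length vector whose length equals the number of reference sequences, so a standard SVM can be trained on these vectors without any profile HMM.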