File(s) under permanent embargo
Real value solvent accessibility prediction using adaptive support vector regression
conference contribution
posted on 2007-01-01, 00:00 authored by J Gubbi, Alistair Shilton, M Palaniswami, M Parker
Knowledge of the secondary structure and solvent accessibility of a protein plays a vital role in predicting its fold and, eventually, its tertiary structure. This paper deals with prediction of relative solvent accessibility given only the amino-acid sequence. We use an improved support vector regression (SVR) and new kernels for real-valued prediction of solvent accessibility, and address two main issues. First, we address the problem of ε selection, which we found to be somewhat problematic in our earlier work (ε is a parameter with significant influence on the noise insensitivity and generalization of SVRs). In particular, rather than employ the standard trial-and-error approach, we use an improved tube shrinking method to find ε. Second, we give a novel kernel that combines a solvation model, an electrostatic charge model, and evolutionary information in the form of a position-specific scoring matrix (PSSM). A new dataset of 472 proteins with less than 20% sequence identity is curated and used to evaluate the method. To make a more objective comparison with earlier methods, we also use a standard dataset and show that the proposed scheme outperforms the ones normally used in the literature. On the standard dataset we report a mean absolute error (MAE) of 0.12, the lowest so far.
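As a minimal sketch of the two quantities the abstract refers to, the snippet below computes the standard ε-insensitive loss that an SVR minimizes (the tube half-width is conventionally written ε) and the mean absolute error used for evaluation. The function names and the sample accessibility values are hypothetical and chosen only for illustration; this is not the authors' tube shrinking method or their kernel.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Mean epsilon-insensitive loss.

    Errors smaller than eps (inside the "tube") cost nothing;
    larger errors are penalized linearly by the amount they exceed eps.
    """
    total = sum(max(0.0, abs(t - p) - eps) for t, p in zip(y_true, y_pred))
    return total / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error (MAE) between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical relative solvent accessibility values in [0, 1],
# one per residue (illustrative numbers, not data from the paper).
observed = [0.10, 0.45, 0.80, 0.30]
predicted = [0.15, 0.40, 0.60, 0.30]

mae = mean_absolute_error(observed, predicted)
loss_wide_tube = eps_insensitive_loss(observed, predicted, eps=0.1)
loss_no_tube = eps_insensitive_loss(observed, predicted, eps=0.0)
```

With ε set to 0 the ε-insensitive loss reduces to the MAE, while a wider tube ignores small errors entirely. This is why the choice of ε affects how sensitive the regressor is to noise.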
History
Event
IEEE Computational Intelligence Society. Symposium (2007 : Honolulu, Hawaii)
Series
IEEE Computational Intelligence Society Symposium
Pagination
395 - 401
Publisher
Institute of Electrical and Electronics Engineers
Location
Honolulu, Hawaii
Place of publication
Piscataway, N.J.
Start date
2007-04-01
End date
2007-04-05
ISBN-13
9781424407101
ISBN-10
1424407109
Language
eng
Publication classification
E1.1 Full written paper - refereed
Editor/Contributor(s)
[Unknown]
Title of proceedings
CIBCB 2007 : Proceedings of the 2007 IEEE Symposium on Computational Intelligence and Bioinformatics and Computational Biology
Keywords
solvents; proteins; support vector machines; support vector machine classification; neural networks; sequences; kernel; feedforward systems; multi-layer neural network; feedforward neural networks; Science & Technology; Technology; Life Sciences & Biomedicine; Computer Science, Interdisciplinary Applications; Engineering, Electrical & Electronic; Mathematical & Computational Biology; Computer Science; Engineering; PROTEIN SECONDARY STRUCTURE; SURFACE-AREAS; IMPROVEMENT; MACHINES