A New Hybrid Coding For Protein Secondary Structure Prediction Based On Primary Structure Similarity
Keywords
Hybrid code; Protein primary structure; Protein secondary structure prediction; Support vector machine
Abstract
The way a protein sequence is encoded can greatly affect the accuracy of protein secondary structure prediction. In this paper, a novel hybrid coding method based on the physicochemical properties of amino acids and their tendency factors is proposed for the prediction of protein secondary structure. Principal component analysis (PCA) is first applied to the physicochemical properties of amino acids to construct a 3-bit code, and then the 3 tendency factors of amino acids are calculated to generate another 3-bit code. The two 3-bit codes are fused to form a novel hybrid 6-bit code. Furthermore, we make a geometry-based similarity comparison of the protein primary structure between the reference set and the test set before the secondary structure prediction. Finally, we use a support vector machine (SVM) to predict the amino acids whose structure is not determined by the primary structure similarity comparison. Experimental results show that our method achieves a satisfactory improvement in the accuracy of protein secondary structure prediction.
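The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The tendency factors are assumed here to be Chou-Fasman-style propensities, P(a, s) = (n_as / n_a) / (n_s / n_total), which the abstract does not confirm. The function names, the toy sequences, and the placeholder PCA values are all hypothetical, and the PCA, similarity comparison, and SVM stages are omitted.

```python
from collections import Counter

STATES = ("H", "E", "C")  # helix, strand, coil


def tendency_factors(sequences, structures):
    """Return {amino_acid: (P_H, P_E, P_C)}.

    Assumed formula (not taken from the paper):
    P(a, s) = (n_as / n_a) / (n_s / n_total).
    """
    pair_counts = Counter()
    aa_counts = Counter()
    state_counts = Counter()
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure lengths differ")
        for aa, state in zip(seq, ss):
            pair_counts[(aa, state)] += 1
            aa_counts[aa] += 1
            state_counts[state] += 1
    total = sum(aa_counts.values())
    factors = {}
    for aa, n_a in aa_counts.items():
        factors[aa] = tuple(
            (pair_counts[(aa, s)] / n_a) / (state_counts[s] / total)
            if state_counts[s] else 0.0
            for s in STATES
        )
    return factors


def hybrid_code(pca_code, factors):
    """Concatenate a 3-value PCA code with the 3 tendency factors per residue."""
    return {
        aa: tuple(pca_code[aa]) + factors[aa]
        for aa in factors
        if aa in pca_code
    }


# Toy labelled data (hypothetical): residues and their H/E/C states.
seqs = ["AAGG", "AGVV"]
sss = ["HHCC", "HCEE"]
factors = tendency_factors(seqs, sss)

# Placeholder values standing in for the first three principal components
# of the physicochemical properties; these are NOT real PCA results.
pca_placeholder = {
    "A": (0.1, 0.2, 0.3),
    "G": (0.0, -0.1, 0.4),
    "V": (-0.2, 0.5, 0.1),
}
codes = hybrid_code(pca_placeholder, factors)
```

Each entry of `codes` is a 6-value vector per amino acid. In a pipeline like the one the abstract outlines, vectors of this kind, typically gathered over a window of neighbouring residues, would serve as the SVM input features.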
Publication Date
6-30-2017
Publication Title
Gene
Volume
618
Pages
8-13
Document Type
Article
Personal Identifier
scopus
DOI Link
https://doi.org/10.1016/j.gene.2017.03.011
Copyright Status
Unknown
Scopus ID
85019004555 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/85019004555
STARS Citation
Li, Zhong; Wang, Jing; Zhang, Shunpu; Zhang, Qifeng; and Wu, Wuming, "A New Hybrid Coding For Protein Secondary Structure Prediction Based On Primary Structure Similarity" (2017). Scopus Export 2015-2019. 4764.
https://stars.library.ucf.edu/scopus2015/4764