ISBN: 978-981-11-3671-9 DOI: 10.18178/wcse.2017.06.233
Feature Extraction Based on Stacked Auto Encoder for Protein Secondary Structure Prediction
Abstract— In this paper, a novel sequence feature extraction method based on the deep learning network is
proposed for protein secondary structure prediction. The architecture consists mainly of a two-layer stacked
auto encoder and a fully connected softmax classifier. Position-specific scoring matrix (PSSM) profiles are
used as raw data for feature extraction. The stacked auto encoder learns second-order feature parameters from
a large number of PSSM profiles of polypeptides without using their secondary structure, which generally
improves the performance of the encoder. Compared with the original PSSM profiles, the extracted features
reflect not only the evolutionary information but also the sequence interactions among residues. Finally, the
extracted features are fed into a fully connected softmax layer that serves as the classifier for secondary
structure prediction. The experimental results indicate that this method achieves an overall accuracy (Q3)
above 78% on 25PDB. This is comparable to state-of-the-art PSSM+SVM methods, while requiring a relatively
short prediction time.
Index Terms— Sparse auto-encoder, Stacked auto encoder, Protein secondary structure prediction, Deep
learning neural network.
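The pipeline described in the abstract (PSSM window, two stacked encoder layers, softmax classifier, Q3 score) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the window size, hidden layer widths, sigmoid input scaling, zero-padding at the sequence ends, and all function names are hypothetical, and the weights are random rather than trained. Unsupervised pre-training of each encoder layer and supervised training of the classifier are omitted.

```python
import math
import random

# All sizes below are illustrative assumptions; the abstract does not state them.
WINDOW = 13        # residues per sliding window (assumed)
PSSM_COLS = 20     # one PSSM column per standard amino acid
HIDDEN1 = 32       # width of the first encoder layer (assumed)
HIDDEN2 = 16       # width of the second encoder layer (assumed)
CLASSES = "HEC"    # three states: helix, strand, coil


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def window_features(pssm, center):
    """Flatten the PSSM rows in a window around `center`.

    Scores are squashed to (0, 1) with a sigmoid (an assumed scaling);
    positions past either end of the sequence are zero-padded.
    """
    half = WINDOW // 2
    feats = []
    for i in range(center - half, center + half + 1):
        if 0 <= i < len(pssm):
            feats.extend(sigmoid(v) for v in pssm[i])
        else:
            feats.extend([0.0] * PSSM_COLS)
    return feats


def init_layer(n_in, n_out, rng):
    """Random, untrained weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def affine(layer, x):
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(pssm, layers):
    """Predict one state per residue: encoder 1, encoder 2, then softmax."""
    enc1, enc2, clf = layers
    states = []
    for pos in range(len(pssm)):
        x = window_features(pssm, pos)
        h1 = [sigmoid(v) for v in affine(enc1, x)]   # first-order features
        h2 = [sigmoid(v) for v in affine(enc2, h1)]  # second-order features
        probs = softmax(affine(clf, h2))
        states.append(CLASSES[probs.index(max(probs))])
    return "".join(states)


def q3(predicted, observed):
    """Q3: fraction of residues whose three-state label is predicted correctly."""
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


rng = random.Random(0)
layers = (
    init_layer(WINDOW * PSSM_COLS, HIDDEN1, rng),
    init_layer(HIDDEN1, HIDDEN2, rng),
    init_layer(HIDDEN2, len(CLASSES), rng),
)
# Toy PSSM for a 30-residue sequence; real profiles would come from PSI-BLAST.
toy_pssm = [[rng.randint(-5, 5) for _ in range(PSSM_COLS)] for _ in range(30)]
prediction = predict(toy_pssm, layers)
```

Because the weights are untrained, the predicted string is meaningless. The sketch only shows how the data flows from a PSSM window through the two encoder layers to a per-residue class, and how Q3 would be computed against observed labels.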
Yehong Che, Jinyong Cheng, Yihui Liu
1Institute of Intelligent Information Processing, 2School of Printing & Packaging,
Qilu University of Technology, CHINA
Cite: Yehong Che, Jinyong Cheng, Yihui Liu, "Feature Extraction Based on Stacked Auto Encoder for Protein Secondary Structure Prediction," Proceedings of 2017 the 7th International Workshop on Computer Science and Engineering, pp. 1345-1352, Beijing, 25-27 June, 2017.