Why can deep convolutional neural networks improve protein fold recognition? A visual explanation by interpretation.

Yan Liu, Yi-Heng Zhu, Xiaoning Song, Jiangning Song, Dong-Jun Yu

Briefings in Bioinformatics, 2021 Sep 02

As an essential task in protein structure and function prediction, protein fold recognition has attracted increasing attention. Most existing machine learning-based approaches to protein fold recognition rely heavily on handcrafted features that describe the characteristics of different protein folds; however, effective feature extraction remains the bottleneck for further performance improvement. As a powerful feature extractor, a deep convolutional neural network (DCNN) can automatically extract discriminative features for fold recognition without human intervention, and DCNNs have demonstrated impressive performance on this task. Despite this encouraging progress, a DCNN often acts as a black box, so it is difficult for users to understand what happens inside the network and why it works well for protein fold recognition. In this study, we explore the intrinsic mechanism of DCNNs and explain why they work for protein fold recognition using a visual explanation technique. More specifically, we first trained a VGGNet-based DCNN model, termed VGGNet-FE, which extracts fold-specific features from the predicted protein residue-residue contact map. Subsequently, based on the trained VGGNet-FE, we implemented a new contact-assisted predictor, termed VGGfold, for protein fold recognition; we then visualized the features extracted by each convolutional layer in VGGNet-FE using a deconvolution technique. Furthermore, we visualized the high-level semantic information of a predicted contact map, termed the fold-discriminative region, from the localization map obtained from the last convolutional layer of VGGNet-FE. The visualizations confirm that VGGNet-FE can effectively extract distinct fold-discriminative regions for different types of protein folds, which accounts for the improved performance of VGGfold in protein fold recognition. In summary, this study is significant both for understanding the working principle of DCNNs in protein fold recognition and for exploring the relationship between the predicted protein contact map and protein tertiary structure. The proposed visualization method is flexible and can be applied to other DCNN-based bioinformatics and computational biology questions. The online web server of VGGfold is freely available at http://csbio.njust.edu.cn/bioinf/vggfold/.

© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
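
The abstract does not state how the localization map from the last convolutional layer is computed. One widely used way to obtain such a map is Grad-CAM-style weighting, in which each feature map is weighted by the spatial average of the class-score gradient flowing into it. The sketch below illustrates that idea in NumPy under this assumption; the function name `localization_map` and the array shapes are hypothetical and are not taken from the paper or from VGGfold.

```python
import numpy as np

def localization_map(feature_maps, gradients):
    """Grad-CAM-style localization map (illustrative assumption, not the paper's code).

    feature_maps: activations of the last convolutional layer, shape (K, H, W).
    gradients:    gradient of the target fold score with respect to those
                  activations, same shape (K, H, W).
    Returns an (H, W) map scaled to [0, 1].
    """
    # One weight per channel: the spatial average of its gradient.
    weights = gradients.mean(axis=(1, 2))                 # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, feature_maps, axes=1)     # shape (H, W)
    # Keep only regions that push the target score up.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy input: two 2x2 feature maps, one with a positive and one with a
# negative influence on the target score.
toy_features = np.array([[[1.0, 0.0], [0.0, 0.0]],
                         [[0.0, 0.0], [0.0, 2.0]]])
toy_gradients = np.array([[[1.0, 1.0], [1.0, 1.0]],
                          [[-1.0, -1.0], [-1.0, -1.0]]])
toy_map = localization_map(toy_features, toy_gradients)
```

In practice the resulting map would be upsampled to the size of the input contact map and overlaid on it, so that the highlighted area can be read as a candidate fold-discriminative region.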

Citation

Yan Liu, Yi-Heng Zhu, Xiaoning Song, Jiangning Song, Dong-Jun Yu. Why can deep convolutional neural networks improve protein fold recognition? A visual explanation by interpretation. Briefings in Bioinformatics. 2021 Sep 02;22(5).

PMID: 33537753
