How protein domain structure changes in response to mutations is not well understood. Some mutations change the structure drastically, while most result in only small changes. To understand this, we decompose the relationship between changes in domain sequence and structure using machine learning. We select pairs of evolutionarily related domains with a broad range of evolutionary distances. In contrast to earlier studies, we do not find a strictly linear relationship between sequence and structural changes. We train a random forest regressor that predicts the structural similarity between pairs with an average accuracy of 0.029 lDDT (local Distance Difference Test) score and a correlation coefficient of 0.92. Decomposing the feature importance shows that the domain length, or analogously, size, is the most important feature. Our model enables assessing deviations in relative structural response, and thus prediction of evolutionary trajectories, in protein domains across evolution. Copyright © 2020 The Authors. Published by Elsevier Ltd. All rights reserved.
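The two figures reported above (an average deviation in lDDT units and a correlation coefficient) can be illustrated with a minimal sketch. It assumes that the "average accuracy" is a mean absolute error between predicted and observed lDDT scores and that the correlation is Pearson's r; the abstract confirms neither. The function names and the five score pairs are hypothetical, made up for illustration, and are not data from the paper.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical lDDT scores for five domain pairs (illustrative only).
true_lddt = [0.90, 0.75, 0.60, 0.85, 0.50]
pred_lddt = [0.88, 0.78, 0.63, 0.83, 0.54]

mae = mean_absolute_error(true_lddt, pred_lddt)
r = pearson_r(true_lddt, pred_lddt)
print(f"mean absolute error: {mae:.3f} lDDT, Pearson r: {r:.3f}")
```

Reporting both numbers is useful because they answer different questions: the error says how far a typical prediction lies from the observed score, while the correlation says how well the predictions preserve the ordering and spread of the observed scores.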


Patrick Bryant, Arne Elofsson. Decomposing Structural Response Due to Sequence Changes in Protein Domains with Machine Learning. Journal of Molecular Biology. 2020 Jul 24;432(16):4435-4446.

PMID: 32485208
