The chemical structure of a protein surface governs its intermolecular interactions, which underlie protein functions such as specific molecular recognition, transport, and self-assembly. Understanding the relationship between surface chemical structure and function therefore offers insight into the mechanisms behind protein function and supports the development of new biomaterials. In this study, we use protein surface features, including surface amino acid populations and secondary structure ratios, rather than entire sequences as input to a classifier, with the aim of gaining deeper insight into what determines protein class (cytosolic or secreted). We employed a random forest-based classifier to predict protein location. The training and testing data sets of secreted and cytosolic proteins were constructed from filtered UniProt information and 3D structures from AlphaFold. The classifier achieved a testing accuracy of 93.9% and yielded a feature importance ranking along with quantitative boundary values for the top three features. We discuss the significance of these features quantitatively, as well as the hidden rules that determine protein class (cytosolic or secreted).
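As an illustration of the kind of surface features the abstract mentions, the following is a minimal Python sketch, not the authors' pipeline. It assumes that per-residue relative solvent accessibility and three-state secondary structure labels (H, E, C) have already been computed from a 3D structure. The function name `surface_features`, the 0.25 accessibility cutoff, and the choice to compute secondary structure ratios over surface residues only are all hypothetical choices made for this example.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def surface_features(sequence, rel_accessibility, secondary_structure, cutoff=0.25):
    """Return surface amino acid fractions and secondary structure ratios.

    sequence            -- one-letter amino acid string
    rel_accessibility   -- per-residue relative solvent accessibility (0..1)
    secondary_structure -- per-residue labels: H (helix), E (strand), C (coil)
    cutoff              -- assumed threshold above which a residue counts as surface
    """
    if not (len(sequence) == len(rel_accessibility) == len(secondary_structure)):
        raise ValueError("inputs must have the same length")

    # Indices of residues treated as surface-exposed (assumption: simple cutoff).
    surface = [i for i, acc in enumerate(rel_accessibility) if acc >= cutoff]
    if not surface:
        raise ValueError("no surface residues at this cutoff")

    n = len(surface)
    aa_counts = Counter(sequence[i] for i in surface)
    ss_counts = Counter(secondary_structure[i] for i in surface)

    # Fraction of surface residues that are each amino acid type.
    features = {f"aa_{aa}": aa_counts.get(aa, 0) / n for aa in AMINO_ACIDS}
    # Fraction of surface residues in each secondary structure state.
    for label in "HEC":
        features[f"ss_{label}"] = ss_counts.get(label, 0) / n
    return features


# Toy example with made-up values, for illustration only.
example = surface_features("MKDE", [0.1, 0.5, 0.6, 0.3], "CHHE")
```

A feature dictionary of this shape, computed for each protein, could then serve as one row of input to a random forest classifier; the features and preprocessing actually used in the study are described in the full text.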

Citation

Guanghao Hu, Jooa Moon, Tomohiro Hayashi. Protein Classes Predicted by Molecular Surface Chemical Features: Machine Learning-Assisted Classification of Cytosol and Secreted Proteins. The Journal of Physical Chemistry B. 2024 Sep 05;128(35):8423-8436

PMID: 39185763
