
We employed segmented principal component analysis and regression as a new methodology in quantitative structure-activity relationship (QSAR) studies to define new amino acid indices. The descriptors are first classified into groups according to the similarity of the information they contain, and each group is then subjected separately to principal component analysis (PCA). The principal components (PCs) extracted from the descriptor data matrix of each group can be considered new sources of amino acid indices. These indices were used as input variables for a QSAR study of two dipeptide data sets: 58 dipeptides with angiotensin-converting enzyme (ACE) inhibitor activity and 48 with bitter tasting threshold (BTT) activity. The relationship between the indices and biological activity was modeled with segmented principal component regression (SPCR) and segmented partial least squares (SPLS). Both methods gave reliable QSAR models. Compared with conventional principal component regression (PCR) and partial least squares (PLS), the segmented methods produced more predictive models. The developed models also performed better than previously reported models for the same data sets. We conclude that segmenting the variables partitions the information into informative and redundant parts, which makes it possible to discard the redundant part and obtain more appropriate models. Copyright © 2012 Elsevier Ltd. All rights reserved.
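
The segmentation idea described above can be illustrated with a minimal sketch: descriptors are split into groups, PCA is run on each group separately, and the pooled PC scores are regressed against the activity. This is not the authors' implementation. The function names (`pca_scores`, `segmented_pcr`), the column grouping, the number of components per group, and the synthetic data are all assumptions made for illustration, and the step of selecting informative components and discarding redundant ones is omitted.

```python
import numpy as np

def pca_scores(block, n_components):
    """Return the leading principal-component scores of one descriptor block."""
    centered = block - block.mean(axis=0)
    # SVD of the centered block: the columns of u * s are the PC scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def segmented_pcr(X, y, groups, n_components):
    """Regress y on PC scores extracted separately from each descriptor group."""
    # One PCA per group, then pool the scores side by side.
    scores = np.hstack([pca_scores(X[:, cols], n_components) for cols in groups])
    # Ordinary least squares with an intercept column.
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return scores, coef, design @ coef

# Synthetic stand-in data (hypothetical): 30 samples, 9 descriptors in 3 groups.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 9))
groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
y = X[:, 0] - 0.5 * X[:, 4] + rng.normal(scale=0.1, size=30)

scores, coef, y_hat = segmented_pcr(X, y, groups, n_components=2)
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"pooled scores: {scores.shape}, training R^2: {r2:.3f}")
```

In conventional PCR a single PCA would be run on all nine columns at once; here each group contributes its own components, so a group carrying mostly redundant information can be left out of the regression without affecting the scores of the other groups.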


Bahram Hemmateenejad, Ramin Miri, Maryam Elyasi. A segmented principal component analysis-regression approach to QSAR study of peptides. Journal of Theoretical Biology. 2012 Jul 21;305:37-44.

PMID: 22575548
