Correlation Engine 2.0
Clear Search sequence regions


filter terms:
  • amino acid (5)
  • lipocalin (11)
  • Sizes of these terms reflect their relevance to your search.

    Lipocalin belongs to the calcyin family, and its sequence length is generally between 165 and 200 residues. They are mainly stable and multifunctional extracellular proteins. Lipocalin plays an important role in several stress responses and allergic inflammations. Because the accurate identification of lipocalins could provide significant evidences for the study of their function, it is necessary to develop a machine learning-based model to recognize lipocalin. In this study, we constructed a prediction model to identify lipocalin. Their sequences were encoded by six types of features, namely amino acid composition (AAC), composition of k-spaced amino acid pairs (CKSAAP), pseudo amino acid composition (PseAAC), Geary correlation (GD), normalized Moreau-Broto autocorrelation (NMBroto) and composition/transition/distribution (CTD). Subsequently, these features were optimized by using feature selection techniques. A classifier based on random forest was trained according to the optimal features. The results of 10-fold cross-validation showed that our computational model would classify lipocalins with accuracy of 95.03% and area under the curve of 0.987. On the independent dataset, our computational model could produce the accuracy of 89.90% which was 4.17% higher than the existing model. In this work, we developed an advanced computational model to discriminate lipocalin proteins from non-lipocalin proteins. In the proposed model, protein sequences were encoded by six descriptors. Then, feature selection was performed to pick out the best features which could produce the maximum accuracy. On the basis of the best feature subset, the RF-based classifier can obtained the best prediction results. © 2022 The Author(s). Published by IMR Press.

    Citation

    Hasan Zulfiqar, Zahoor Ahmed, Cai-Yi Ma, Rida Sarwar Khan, Bakanina Kissanga Grace-Mercure, Xiao-Long Yu, Zhao-Yue Zhang. Comprehensive Prediction of Lipocalin Proteins Using Artificial Intelligence Strategy. Frontiers in bioscience (Landmark edition). 2022 Mar 05;27(3):84

    Expand section icon Mesh Tags

    Expand section icon Substances


    PMID: 35345316

    View Full Text