Motifs are evolutionarily conserved patterns that are reported to serve crucial structural and functional roles. Identifying motif patterns in a set of protein sequences has been a prime concern for researchers in computational biology. Existing algorithms discover such protein motifs based purely on parameters derived from sequence composition and length. However, the discovery of variable-length motifs remains a challenging task, as it is not possible to determine the length of a motif in advance. In the current work, a k-mer based motif discovery approach called Pr[m] is proposed for the detection of statistically significant un-gapped motif patterns, with or without wildcard characters. To analyze the performance of the proposed approach, a comparative study was performed with MEME and GLAM2, two widely used non-discriminative methods for motif discovery. A set of 7,500 test datasets was used to compare the proposed tool with these two methods. Pr[m] outperformed the existing methods in terms of predictive quality and performance. The proposed approach is hosted at[m].


Rahul Semwal, Imlimaong Aier, Utkarsh Raj, Pritish Kumar Varadwaj. Pr[m]: An Algorithm for Protein Motif Discovery. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2022 Jan-Feb;19(1):585-592.

PMID: 32750855

