Correlation Engine 2.0
Clear Search sequence regions


Sizes of these terms reflect their relevance to your search.

Molecular property prediction faces the challenge of limited labeled data as it necessitates a series of specialized experiments to annotate target molecules. Data augmentation techniques can effectively address the issue of data scarcity. In recent years, Mixup has achieved significant success in traditional domains such as image processing. However, its application in molecular property prediction is relatively limited due to the irregular, non-Euclidean nature of graphs and the fact that minor variations in molecular structures can lead to alterations in their properties. To address these challenges, we propose a novel data augmentation method called Mix-Key tailored for molecular property prediction. Mix-Key aims to capture crucial features of molecular graphs, focusing separately on the molecular scaffolds and functional groups. By generating isomers that are relatively invariant to the scaffolds or functional groups, we effectively preserve the core information of molecules. Additionally, to capture interactive information between the scaffolds and functional groups while ensuring correlation between the original and augmented graphs, we introduce molecular fingerprint similarity and node similarity. Through these steps, Mix-Key determines the mixup ratio between the original graph and two isomers, thus generating more informative augmented molecular graphs. We extensively validate our approach on molecular datasets of different scales with several Graph Neural Network architectures. The results demonstrate that Mix-Key consistently outperforms other data augmentation methods in enhancing molecular property prediction on several datasets. © The Author(s) 2024. Published by Oxford University Press.

Citation

Tianyi Jiang, Zeyu Wang, Wenchao Yu, Jinhuan Wang, Shanqing Yu, Xiaoze Bao, Bin Wei, Qi Xuan. Mix-Key: graph mixup with key structures for molecular property prediction. Briefings in bioinformatics. 2024 Mar 27;25(3)

Expand section icon Mesh Tags


PMID: 38706318

View Full Text