
DNA molecules commonly exhibit long-range interactions between nucleobases, and modeling these interactions is important for accurate sequence-based inference. Although many deep learning methods have recently been developed for modeling DNA sequences, they still suffer from two major issues: 1) most existing methods can handle only short DNA fragments and fail to capture long-range information; 2) current methods typically require massive supervised labels, which are hard to obtain in practice. We propose a new method to address both issues. Our neural network employs circular dilated convolutions as building blocks in its backbone. As a result, our network can take long DNA sequences as input without any condensation. We also incorporate the neural network into a self-supervised learning framework to capture inherent information in DNA without expensive supervised labeling. We have tested our model on two DNA inference tasks, human variant effect prediction and open chromatin region detection in plants, where the experimental results show that our method outperforms five other deep learning models. Our code is available at https://github.com/wiedersehne/cdilDNA. Copyright © 2023. Published by Elsevier Ltd.
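The abstract's key architectural idea, circular dilated convolutions with exponentially growing dilation, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation (which lives in the linked repository); the class names `CDILBlock` and `CDILBackbone`, the channel width, layer count, and the GELU/residual choices are illustrative assumptions. It shows why the backbone keeps the sequence length constant (circular padding) while the receptive field grows exponentially with depth, so long sequences need no condensation.

```python
import torch
import torch.nn as nn


class CDILBlock(nn.Module):
    """One circular dilated convolution block (illustrative sketch).

    Circular ("wrap-around") padding keeps the output length equal to
    the input length, so no pooling or truncation is ever needed.
    """

    def __init__(self, channels: int, kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        # Symmetric padding that exactly preserves sequence length.
        pad = dilation * (kernel_size - 1) // 2
        self.conv = nn.Conv1d(
            channels, channels, kernel_size,
            padding=pad, dilation=dilation,
            padding_mode="circular",  # wrap around the sequence ends
        )
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection stabilizes deep stacks (assumed detail).
        return self.act(self.conv(x)) + x


class CDILBackbone(nn.Module):
    """Stack of blocks with dilation 1, 2, 4, ... so the receptive
    field covers a length-L sequence in O(log L) layers."""

    def __init__(self, channels: int = 32, num_layers: int = 6):
        super().__init__()
        self.blocks = nn.Sequential(
            *[CDILBlock(channels, dilation=2 ** i) for i in range(num_layers)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, sequence_length) -> same shape out.
        return self.blocks(x)
```

Because every block is length-preserving, the same backbone applies unchanged to sequences of any length, which is what enables both long-range modeling and masked self-supervised pretraining on raw DNA.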

Citation

Lei Cheng, Tong Yu, Ruslan Khalitov, Zhirong Yang. Self-supervised learning for DNA sequences with circular dilated convolutional networks. Neural Networks. 2024 Mar;171:466-473.

PMID: 38150872
