CRISPR-Embedding: CRISPR/Cas9 Off-Target Activity Prediction Using DNA k-Mer Embedding

18 Pages Posted: 31 Mar 2022 Publication Status: Under Review

Swakkhar Shatabda

United International University

Anika Tahsin

United International University

Zarin Tasnim

United International University

Muneera Chowdhury

United International University

Kangkhita Hassin

United International University

Galib Hossain Meraz

United International University

Kazi Farzana Aziz

United International University

Abstract

CRISPR/Cas9 has become a revolutionary gene-editing tool for biologists and other researchers. However, the technology carries the risk of off-target effects, that is, edits at unintended sites, which may harm normal cell function. Many computational approaches have therefore been proposed for accurate off-target prediction. Conventional feature and data handling in these approaches leads to data imbalance issues, and many of the architectures are unnecessarily complex. In this paper, we present CRISPR-Embedding, a deep learning model that uses a 9-layer Convolutional Neural Network (CNN) to predict CRISPR/Cas9 off-targets and represents sequences with DNA k-mer embeddings. In addition, using data augmentation and under-sampling, we produced a substantially cleaner dataset to mitigate data imbalance. Evaluated with 5-fold cross-validation, CRISPR-Embedding achieved an average accuracy of 94.07%. Furthermore, comparison with other state-of-the-art methods clearly showed improved off-target activity prediction.
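To make the idea of k-mer sequence representation concrete, the following is a minimal sketch of how a DNA sequence might be split into overlapping k-mers and mapped to integer indices, the usual input to a learned embedding layer. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not give the value of k, the stride, the vocabulary, or how guide and off-target sequences are paired, so `k=3`, `stride=1`, the helper names, and the example sequence are all hypothetical.

```python
from itertools import product


def build_kmer_vocab(k, alphabet="ACGT"):
    """Map every possible k-mer over the alphabet to an index starting at 1.

    Index 0 is reserved here for k-mers containing unexpected symbols
    (an assumption of this sketch, not something stated in the abstract).
    """
    return {"".join(p): i + 1 for i, p in enumerate(product(alphabet, repeat=k))}


def kmer_tokens(seq, k, stride=1):
    """Split a sequence into overlapping k-mers using a sliding window."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def encode(seq, k, vocab):
    """Turn a sequence into the list of integer ids an embedding layer would consume."""
    return [vocab.get(token, 0) for token in kmer_tokens(seq, k)]


# Hypothetical 23-nt example: a 20-nt protospacer followed by a 3-nt PAM.
vocab = build_kmer_vocab(3)
site = "GAGTCCGAGCAGAAGAAGAAGGG"
ids = encode(site, 3, vocab)
print(len(vocab), len(ids), ids[:5])
```

In a full model, each id would index a row of a trainable embedding matrix, and the resulting sequence of vectors would be passed to the convolutional layers. With k = 3 the vocabulary has 4^3 = 64 entries, and a 23-nt sequence yields 21 overlapping tokens.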

Keywords: CRISPR/Cas9, Deep learning, Word embeddings, Off-targets

Suggested Citation

Shatabda, Swakkhar and Tahsin, Anika and Tasnim, Zarin and Chowdhury, Muneera and Hassin, Kangkhita and Meraz, Galib Hossain and Aziz, Kazi Farzana, CRISPR-Embedding: CRISPR/Cas9 Off-Target Activity Prediction Using DNA k-Mer Embedding. Available at SSRN: https://ssrn.com/abstract=4071629 or http://dx.doi.org/10.2139/ssrn.4071629

Swakkhar Shatabda (Contact Author)

United International University

Madani Avenue, Dhaka, Bangladesh
Dhaka, Dhaka 1216
Bangladesh

Anika Tahsin

United International University

H-80 R-8/A Satmasjid Rd
Dhanmondi
Dhaka, 1209
Bangladesh

Zarin Tasnim

United International University

H-80 R-8/A Satmasjid Rd
Dhanmondi
Dhaka, 1209
Bangladesh

Muneera Chowdhury

United International University

H-80 R-8/A Satmasjid Rd
Dhanmondi
Dhaka, 1209
Bangladesh

Kangkhita Hassin

United International University

H-80 R-8/A Satmasjid Rd
Dhanmondi
Dhaka, 1209
Bangladesh

Galib Hossain Meraz

United International University

H-80 R-8/A Satmasjid Rd
Dhanmondi
Dhaka, 1209
Bangladesh

Kazi Farzana Aziz

United International University

H-80 R-8/A Satmasjid Rd
Dhanmondi
Dhaka, 1209
Bangladesh
