Genetic Algorithm for Whole Genome Sequence Percentage Similarity
4 Pages. Posted: 2 Apr 2020
Date Written: April 1, 2020
Abstract
Chromosomes carry varied genes, and those genes are often prone to re-arrangement. When the DNA sequences of homologous species are submitted to BLAST, they are aligned, and gaps are inserted wherever there is a mismatch. When genes have been re-arranged, however, a percentage similarity cannot be concluded from such an alignment. The percentage similarity therefore needs to be obtained by comparing the DNA sequences position by position, which both identifies the gene re-arrangements and yields an authentic percentage similarity.
Therefore, an algorithm has been developed in the Java programming language that retrieves the sequence similarity by position-to-position comparison without inserting any gaps. This makes the algorithm authentic, as it reports the raw percentage of sequence similarity. Its authenticity has been further tested by retrieving the signature sequence belonging to one of the two DNA sequences, by further modifying the given algorithm, and by comparing the results with BLAST results.
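The gap-free, position-to-position comparison described above can be illustrated with a minimal Java sketch. This is not the authors' implementation: the class name PositionalSimilarity and the method percentIdentity are hypothetical, and the handling of unequal lengths (dividing by the longer sequence, so unpaired trailing bases count as mismatches) is an assumption, since the abstract does not say how that case is treated.

```java
// Illustrative sketch only; names and length handling are assumptions,
// not taken from the paper.
final class PositionalSimilarity {

    // Returns the percentage of positions at which both sequences carry
    // the same base. No gaps are inserted: index i of the first sequence
    // is only ever compared with index i of the second.
    static double percentIdentity(String first, String second) {
        int longer = Math.max(first.length(), second.length());
        if (longer == 0) {
            throw new IllegalArgumentException("both sequences are empty");
        }
        int shared = Math.min(first.length(), second.length());
        int matches = 0;
        for (int i = 0; i < shared; i++) {
            // Case-insensitive so that "acgt" and "ACGT" are treated alike.
            char a = Character.toUpperCase(first.charAt(i));
            char b = Character.toUpperCase(second.charAt(i));
            if (a == b) {
                matches++;
            }
        }
        // Assumption: positions beyond the shorter sequence count as mismatches.
        return 100.0 * matches / longer;
    }

    public static void main(String[] args) {
        // Two point differences in ten positions.
        System.out.println(percentIdentity("ACGTACGTAC", "ACGTTCGTAA"));
        // Same two blocks in swapped order: a toy re-arrangement.
        System.out.println(percentIdentity("AAAACCCC", "CCCCAAAA"));
    }
}
```

In the first call, 8 of the 10 positions agree. In the second, the two sequences contain the same blocks in swapped order, so no position agrees and the raw positional similarity is zero, which is how a re-arrangement shows up in a comparison made without gaps.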
Keywords: gene re-arrangement, algorithm, whole genome sequence similarity, gaps, signature sequence