TITLE:
Similarity Studies of Corona Viruses through Chaos Game Representation
AUTHORS:
Dipendra C. Sengupta, Matthew D. Hill, Kevin R. Benton, Hirendra N. Banerjee
KEYWORDS:
Covid-19, Chaos Game Representation, Deoxyribonucleic Acid, Phylogenetic Analysis, Shannon Entropy
JOURNAL NAME:
Computational Molecular Bioscience, Vol.10 No.3, June 29, 2020
ABSTRACT: The novel coronavirus (SARS-CoV-2), generally referred to as the Covid-19 virus, has spread to 213 countries, with nearly 7 million confirmed cases and nearly 400,000 deaths. Such major outbreaks demand classification of the virus's genomic sequence and identification of its origin for planning, containment, and treatment. Motivated by this need, we report two alignment-free methods combined with Chaos Game Representation (CGR) to perform clustering analysis and build a phylogenetic tree from the results. To each DNA sequence we associate a matrix, and we define the distance between two DNA sequences as the distance between their associated matrices. These methods are used for phylogenetic analysis of coronavirus sequences. Our approach provides a powerful tool for analyzing and annotating genomes and their phylogenetic relationships. We also compare our tool with the ClustalX algorithm, one of the most popular alignment methods. Our alignment-free methods are shown to be capable of finding the closest genetic relatives of coronaviruses.
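
ILLUSTRATIVE SKETCH: The abstract's core idea (map each DNA sequence to a matrix, then measure the distance between matrices) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the matrix construction or the distance, so the sketch assumes a frequency matrix of CGR points on a 2^k x 2^k grid, a commonly used corner assignment (A, C, G, T), and the Euclidean (Frobenius) distance. The names cgr_matrix and cgr_distance are hypothetical.

```python
import math

# Assumed corner convention for the unit square (x, y); other layouts exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_matrix(seq, k=3):
    """Return a 2^k x 2^k matrix of normalized CGR point counts (assumed form)."""
    n = 2 ** k
    counts = [[0.0] * n for _ in range(n)]
    x, y = 0.5, 0.5  # chaos game starts at the center of the square
    placed = 0
    for base in seq.upper():
        corner = CORNERS.get(base)
        if corner is None:
            continue  # assumption: silently skip ambiguous symbols such as N
        # Move halfway from the current point toward the corner of this base.
        x, y = (x + corner[0]) / 2, (y + corner[1]) / 2
        row = min(int(y * n), n - 1)
        col = min(int(x * n), n - 1)
        counts[row][col] += 1.0
        placed += 1
    if placed == 0:
        return counts
    return [[c / placed for c in row] for row in counts]


def cgr_distance(seq_a, seq_b, k=3):
    """Euclidean distance between the two CGR matrices (an assumed choice)."""
    ma, mb = cgr_matrix(seq_a, k), cgr_matrix(seq_b, k)
    return math.sqrt(
        sum((a - b) ** 2 for ra, rb in zip(ma, mb) for a, b in zip(ra, rb))
    )
```

Computing cgr_distance for every pair of genomes would give a distance matrix that a standard clustering or tree-building routine could consume. The grid resolution k trades sensitivity against noise and would need tuning for real coronavirus genomes.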