Microsatellite Diversity, Complexity, and Host Range of Mycobacteriophage Genomes of the Siphoviridae Family

Alam, Chaudhary Mashhood and Iqbal, Asif and Sharma, Anjana and Schulman, Alan H. and Ali, Safdar (2019) Microsatellite Diversity, Complexity, and Host Range of Mycobacteriophage Genomes of the Siphoviridae Family. Frontiers in Genetics, 10. ISSN 1664-8021


Abstract

The incidence, distribution, and variation of simple sequence repeats (SSRs) in viruses are instrumental in understanding the functional and evolutionary aspects of repeat sequences. Full-length genome sequences retrieved from NCBI were used for extraction and analysis of repeat sequences using IMEx software. We have also developed two MATLAB-based tools, one for extracting gene locations from GenBank in tabular format and one for simulating these data with SSR incidence data. The present study, encompassing 147 Mycobacteriophage genomes, revealed 25,284 SSRs and 1,127 compound SSRs (cSSRs) through IMEx. Mono- to hexa-nucleotide motifs were present. The SSR count per genome ranged from 78 (M100) to 342 (M58), while cSSR incidence ranged from 1 (M138) to 17 (M28, M73). Though cSSRs were present in all the genomes, their frequency varied, with the SSR-to-cSSR conversion percentage ranging from 1.08 (M138 with 93 SSRs) to 8.33 (M116 with 96 SSRs). In terms of localization, the SSRs were predominantly found in coding regions (∼78%). Interestingly, genomes of around 50 kb contained numbers of SSRs/cSSRs similar to those in a 110 kb genome, suggesting functional relevance for SSRs, which was substantiated by variation in motif constitution between species with different host ranges. The three species with a broad host range (M97, M100, M116) have around 90% of their mono-nucleotide repeat motifs composed of G or C, and only M16 has both A and T mono-nucleotide motifs. Around 20% of the di-nucleotide repeat motifs in the genomes exhibiting a broad host range were CT/TC, which were either absent or represented to a much lesser extent in the other genomes.
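
The two per-genome summaries quoted in the abstract, the SSR-to-cSSR conversion percentage and the share of SSRs located in coding regions, can be illustrated with a minimal Python sketch. This is not the authors' MATLAB tooling or IMEx itself. The function names, the interval representation, and the rule that an SSR counts as "coding" only when it lies entirely inside one gene interval are all assumptions made for illustration. The cSSR count of 8 for M116 is inferred from the reported 8.33% and 96 SSRs; it is not stated in the abstract.

```python
def cssr_percentage(cssr_count, ssr_count):
    """Conversion percentage, assumed here to be cSSR count / SSR count * 100."""
    if ssr_count <= 0:
        raise ValueError("ssr_count must be positive")
    return 100.0 * cssr_count / ssr_count


def coding_fraction(ssr_intervals, gene_intervals):
    """Fraction of SSRs lying entirely inside at least one gene interval.

    Intervals are (start, end) pairs with inclusive coordinates.
    The containment rule is an illustrative assumption.
    """
    if not ssr_intervals:
        return 0.0
    coding = 0
    for ssr_start, ssr_end in ssr_intervals:
        # An SSR is counted once, even if several genes contain it.
        if any(g_start <= ssr_start and ssr_end <= g_end
               for g_start, g_end in gene_intervals):
            coding += 1
    return coding / len(ssr_intervals)


# Values from the abstract: M138 has 1 cSSR among 93 SSRs.
m138 = cssr_percentage(1, 93)
# Hypothetical cSSR count of 8 for M116, inferred from 8.33% of 96 SSRs.
m116 = cssr_percentage(8, 96)

# Toy coordinates (not from the paper): two of three SSRs fall inside genes.
toy_ssrs = [(5, 10), (50, 60), (95, 105)]
toy_genes = [(1, 40), (90, 120)]
toy_fraction = coding_fraction(toy_ssrs, toy_genes)
```

Rounded to two decimals, `m138` and `m116` agree with the 1.08 and 8.33 reported above, which supports reading the conversion percentage as a simple count ratio.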

Item Type: Article
Subjects: Academics Guard > Medical Science
Depositing User: Unnamed user with email support@academicsguard.com
Date Deposited: 22 Feb 2023 10:56
Last Modified: 14 Sep 2024 04:47
URI: http://science.oadigitallibraries.com/id/eprint/214
