
Category: Bacterial Pathogenesis
Deep and Wide: Comparative Genomics of Chlamydia
Abstract:
This chapter focuses on comparative genomics of Chlamydia species, well-known human and animal pathogens that constitute the family Chlamydiaceae. Thirty-three chlamydial genome sequences produced by whole-genome shotgun (WGS) sequencing are publicly available as complete or draft genomes in GenBank. The Chlamydiaceae pangenome was first examined by a BLAST score ratio (BSR) analysis using a single representative of each of the eight sequenced species. The secreted effectors identified in C. trachomatis are more divergent in other species than are the other selected virulence factors. One of the most significant features of the chlamydial plasticity zone (PZ) revealed by comparative analysis is the heterogeneity of the chlamydial cytotoxin (tox) across the Chlamydiaceae. The tox variation identified here is a prime example of the utility of comparative genomics for identifying novel interspecies genetic variation for further exploration. The chapter also examines the processes of gene decay and loss that are shaping the Chlamydiaceae. The symptoms of human disease are variable, ranging from no clinical signs at all to severe systemic disease. C. trachomatis is represented by 20 of the 33 genomes currently available, as befits the most recognizable pathogen of the family Chlamydiaceae, which is responsible for significant sexually transmitted disease morbidity and infectious blindness worldwide.
BLAST score ratio (BSR) similarity profile of selected known virulence factors across the Chlamydiaceae. The BSR for each protein was scored from 0.0 (black; least similar) to 1.0 (red; most similar) and hierarchically clustered by species (Pearson correlation with average linkage). doi:10.1128/9781555817329.ch2.f1
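For readers unfamiliar with the measure, a BSR is conventionally the bit score of a reference protein's best hit in a target genome divided by the bit score of the reference protein aligned to itself. The Python sketch below illustrates that ratio under that assumption. The function names, the clamping to the 0.0 to 1.0 range, the treatment of a missing hit as 0.0, and the example scores are all hypothetical and are not taken from the chapter.

```python
from typing import Dict, List


def blast_score_ratio(hit_bits: float, self_bits: float) -> float:
    """Return hit_bits / self_bits, clamped to the range [0.0, 1.0].

    hit_bits:  bit score of the reference protein against a target genome.
    self_bits: bit score of the reference protein against itself.
    """
    if self_bits <= 0:
        raise ValueError("self-hit bit score must be positive")
    return max(0.0, min(1.0, hit_bits / self_bits))


def bsr_profile(
    self_bits: Dict[str, float],
    hit_bits: Dict[str, Dict[str, float]],
    genomes: List[str],
) -> Dict[str, Dict[str, float]]:
    """Build a protein-by-genome table of BSR values.

    A protein with no recorded hit in a genome is assigned 0.0
    (an assumption made for this sketch).
    """
    profile: Dict[str, Dict[str, float]] = {}
    for protein, ref_score in self_bits.items():
        hits = hit_bits.get(protein, {})
        profile[protein] = {
            genome: blast_score_ratio(hits.get(genome, 0.0), ref_score)
            for genome in genomes
        }
    return profile


if __name__ == "__main__":
    # Hypothetical bit scores, for illustration only.
    reference_self = {"proteinA": 200.0, "proteinB": 150.0}
    best_hits = {
        "proteinA": {"genome1": 190.0, "genome2": 80.0},
        "proteinB": {"genome1": 150.0},
    }
    table = bsr_profile(reference_self, best_hits, ["genome1", "genome2"])
    for protein, row in table.items():
        print(protein, row)
```

Rows of such a table could then be passed to a clustering routine; the Pearson correlation and average linkage step named in the caption is not reproduced here.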
Comparison of the chlamydial plasticity zone across nine representative genomes. Regions are ordered by size, with selected coding sequences highlighted in color. Asterisks denote selected genes with evidence of truncation or decay. doi:10.1128/9781555817329.ch2.f2
sSNP phylogenetic tree using all sequenced C. pneumoniae genomes. The number of separating synonymous SNPs (sSNPs) is given on each branch. High-quality sSNPs were identified by comparing the predicted genes on the closed genome of C. pneumoniae strain AR39 with the LPCoLN genome sequence. A polymorphic site was considered of high quality when its underlying sequence comprised at least three sequencing reads with an average Phred quality score greater than 30. sSNPs in CWL029, TW183, and J138 were similarly identified, although no assessment of quality could be made, as quality scores are not available for these genomes. Concatenated sSNPs for the individual C. pneumoniae isolates were further analyzed by the HKY85 method with 200 bootstrap replicates, and the results were used to generate an unrooted phylogenetic tree according to the PhyML algorithms. doi:10.1128/9781555817329.ch2.f3
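The quality rule stated in this caption (at least three sequencing reads with an average Phred quality score greater than 30) can be expressed as a simple filter. The Python sketch below is one reading of that rule, assuming the Phred scores of the reads covering a site are available as a list of integers; the function name, constants, and input layout are hypothetical and do not describe the authors' pipeline.

```python
from statistics import mean
from typing import Sequence

# Thresholds taken from the caption's stated rule.
MIN_READS = 3
MIN_MEAN_PHRED = 30.0


def is_high_quality_site(phred_scores: Sequence[int]) -> bool:
    """Return True when a polymorphic site meets the stated quality rule.

    phred_scores holds one Phred quality score per read covering the site.
    The site passes only if it is covered by at least MIN_READS reads and
    their mean Phred score is strictly greater than MIN_MEAN_PHRED.
    """
    if len(phred_scores) < MIN_READS:
        return False
    return mean(phred_scores) > MIN_MEAN_PHRED


if __name__ == "__main__":
    # Hypothetical per-site read qualities, for illustration only.
    sites = {
        "site_1": [35, 32, 31],
        "site_2": [40, 40],
        "site_3": [30, 30, 30],
    }
    for name, scores in sites.items():
        print(name, is_high_quality_site(scores))
```

Genomes without quality scores, such as the three noted in the caption, could not be passed through a filter of this kind.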
Comparison of two regions of SNP accumulation in C. pneumoniae, with SNP location and type (synonymous, green; nonsynonymous, red). Grey highlighting shows SNP-associated CDS fragmentation. (A) The plasticity zone. LPCoLN gene region, ORF00689 to ORF00665; AR39 gene region, CP_0585 to CP_0622. (B) Pmp cluster. LPCoLN gene region, ORF00989 to ORF00956; AR39 gene region, CP_0280 to CP_0309. doi:10.1128/9781555817329.ch2.f4
Circular representation of C. trachomatis proteomic similarity across 19 genomes, relative to C. trachomatis D, showing hot spots of gene variability. Data are from outermost circle to innermost. In the first two outermost circles, black tick marks represent predicted CDSs on the plus strand of C. trachomatis 6BC and the minus strand, respectively. The following two circles plot %GC and GC skew as histograms. The following circle plots the positions of proteins that are present and highly conserved (red; BSR, >0.8) across all genomes. Each subsequent circle shows the positions of variable or unique proteins only for each genome as labeled. Color coding is as follows: purple, the protein is present in <19 genomes (including the reference); green, protein is present in ≤10 genomes; blue, protein is present in ≤5 genomes; orange, protein is present only in the reference; grey, protein is absent in the reference genome. C. trachomatis strains, from the outermost circle moving toward the center, are as follows: A-HAR-13, B-TZA828OT, B-Jali20OT, D-EC, D-LC, D-s2923, E-150, E-1103, G-11074, G-9301, G-11222, G-9768, L2b-UCH-1proctitis, Sweden2, 434Bu, 6276, 6276s, 70, and 70s. doi:10.1128/9781555817329.ch2.f5
Key parameters of selected second- and third-generation sequencing technologies, compared to first-generation Sanger sequencing
Genome features of current publicly available Chlamydiaceae genomes
Pangenome analysis of the Chlamydiaceae across representative species of all sequenced Chlamydiaceae and within each Chlamydiaceae species that is represented by multiple genome sequences
Breakdown of SNP/indels per Chlamydiaceae genome, where multiple genomes are available