The aims of this chapter are to present the current status of the genome project, to summarize the main findings of that project, and to discuss its potential impact on future tuberculosis research. The chromosomal inserts carried by the latter clones can be excised by digestion with and are destined for systematic DNA sequence analysis. Genes are mapped rapidly by hybridizing specific probes to cosmid grids, and since cosmid clones are distributed relatively uniformly within a contig, genes can usually be positioned with a precision of 10 to 20 kb. Sequences are considered finished when both strands have been completely sequenced at an indel error rate of about 1/5,000 and when the sequence has been analyzed for genes and open reading frames (ORFs). Finished sequences are submitted electronically to GenBank and to MycDB, a mycobacterial mapping and sequence database based on the ACEDB software. Analysis of the TBC2 sequence revealed 23 putative genes, including 14 encoding polypeptides with homology to known proteins and 9 ORFs. The enzyme is highly homologous to PUR3. A putative transport protein, encoded by , shows weak homology to the ArsB proton pump, P-glycoprotein, and other transport proteins.

Citation: Cole S, Smith D. 1994. Toward Mapping and Sequencing the Genome of , p 227-238. In Bloom B (ed), Tuberculosis. ASM Press, Washington, DC. doi: 10.1128/9781555818357.ch16
Figure 1

Integrated approach to mycobacterial genome mapping. The steps and approaches being used in analysis of the chromosome are shown in the form of a flow diagram. Genomic DNA is used for either PFGE analysis or cosmid cloning. The integrated map combines data from physical, contig, and gene mapping, and this map together with DNA sequences is stored in MycDB, a database dedicated to mycobacterial research.

Figure 2

PFGE analysis of H37Rv and Ra. (A) DNA in agarose plugs from H37Ra was digested with I (lane 2) or I (lane 3), while H37Rv DNA was digested with I (lane 4) and separated by field inversion gel electrophoresis as described previously ( ). The agarose gel (1% in 0.66× Tris-borate-EDTA [TBE]) was calibrated with yeast chromosomes (lane 1) or λ concatamers (not shown) and run for 48 h at constant voltage (240 V). (B) Similar DNA samples from H37Rv were digested with I (lane 2) and I (lane 3), and H37Ra DNA was digested with I (lane 4) or I (lane 5) and separated by contour-clamped homogeneous electric field gel electrophoresis as described previously ( ). The agarose gel (1.2% in 0.5× TBE) was calibrated with λ concatamers (lane 1) and run for 50 h at constant voltage (170 V). Fragments discussed in the text are indicated by arrowheads. The black arrow indicates the 480-kb I fragment, and the white arrows indicate 260- and 280-kb I fragments.

Figure 3

Schematic diagram of the multiplex cosmid sequencing process. See text for explanation.

Figure 4

Map of cosmid TBC2. A reference scale in kilobases is given below the map. The positions and orientations of genes are indicated on the map. Shaded arrows represent genes that have homology to other genes in public databases, and open arrows represent ORFs that do not have significant homologies to any known genes. Key to gene names: , pyruvate carboxylase; , 22-kDa putative cell cycle regulator; , homolog of a lipopolysaccharide operon gene; , phosphoribosylglycinamide formyltransferase; , putative anion transporter; , putative UDP-sugar transferase; , rRNA methyltransferase; , putative methylase; , putative acetyl coenzyme A ligase; , homolog of the gene; , polyketide synthase (beta-keto reductase); , polyketide synthase (ketoacyl acyl carrier protein synthase); ,,, polyketide synthase (acyltransferase, dehydratase, enoyl reductase). See text for references.

Identities and sources of genetic markers mapped in

