Chapter 2: The Genome, Genes, and Functions

This chapter discusses the genome project. Each group involved in the genome sequencing project chose its own strategies and methods to annotate the chromosome region it was responsible for. In a second step, annotation of the whole genome sequence was made consistent using Imagene, an integrated and cooperative computer environment for sequence annotation and analysis. Searching for protein coding sequences (CDSs) in the genome sequence was relatively easy because the translation initiation sites are well conserved with respect to the "consensus" sequence, defined as the complement of the 3' end of the 16S rRNA; this conservation was probably due to the lack of S1 protein in ribosomes. During the final round of annotation of the genome sequence, no minimal size for open reading frames was set, so that small genes (a number of which had already been identified and their functions precisely described) did not escape analysis. The ribosomal operons had already been sequenced before the advent of the genome project. Several independent strategies were developed using Imagene to detect sequencing errors in the final genome sequence. A small fraction of the 4,106 protein-encoding genes are in operons in which the function of only one cistron is described; the functions of the others are assumed to be related. Finally, the chapter describes the SubtiList database and its World Wide Web server.
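
The CDS search described in the abstract can be illustrated with a minimal Python sketch. It is not the Imagene procedure, whose details are not given here. It scans only the forward strand, pairs each start codon with the first in-frame stop codon, applies no minimum length by default, and flags candidates that have a ribosome-binding motif a short distance upstream of the start. The function name `find_candidate_cds`, the core motif `GGAGG`, the set of start codons, and the 4 to 12 nucleotide spacing window are all assumptions chosen for illustration.

```python
START_CODONS = {"ATG", "GTG", "TTG"}   # assumed set of bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_candidate_cds(seq, rbs_motif="GGAGG", min_gap=4, max_gap=12, min_codons=0):
    """Return (start, end, has_rbs) tuples for forward-strand candidate CDSs.

    start/end are 0-based, end-exclusive, and include the stop codon.
    has_rbs is True when rbs_motif ends between min_gap and max_gap
    nucleotides upstream of the start codon (illustrative values).
    min_codons=0 mirrors the "no minimal size" choice in the text.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] not in START_CODONS:
            continue
        # Walk codon by codon, in frame, until the first stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3  # codons before the stop
                if n_codons >= min_codons:
                    lo = max(0, start - max_gap - len(rbs_motif))
                    hi = max(0, start - min_gap)
                    has_rbs = rbs_motif in seq[lo:hi]
                    hits.append((start, pos + 3, has_rbs))
                break  # only the first in-frame stop ends this candidate
    return hits
```

Calling `find_candidate_cds(sequence)` on a DNA string returns every start-to-stop candidate, however short; raising `min_codons` would reintroduce a size cutoff, which is the filter the annotators chose not to apply. A real annotation pipeline would also scan the reverse strand and score the initiation site rather than test for an exact motif.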

Citation: Moszer I. 2002. The Genome, Genes, and Functions, p 7-11. In Sonenshein A, Losick R, Hoch J (ed), Bacillus subtilis and Its Closest Relatives. ASM Press, Washington, DC. doi: 10.1128/9781555817992.ch2

Gene Expression and Regulation
