Chapter 2: PlasmoDB: The Plasmodium Genome Resource

The Plasmodium Genome Database (PlasmoDB) was introduced in 2000 in response to the emerging needs of the malaria research community for access to genomic-scale datasets. In its earliest manifestations, prior to completion of the genome sequence and curated annotation, PlasmoDB focused on automated analysis of available sequence data, enabling researchers to identify draft sequences for specific genes of interest even in the absence of the surrounding genomic context. The availability of an effectively complete genome sequence has stimulated a wide range of functional genomics research, and PlasmoDB has endeavored to keep pace with these studies by providing access to the underlying datasets and supporting a variety of integrative queries, e.g., finding all genes for which both transcript and proteomics data suggest expression in gametocyte-stage parasites. PlasmoDB will continue to integrate new datasets as they emerge, developing tools of interest to the malaria researcher and facilitating the discovery of new diagnostics, drugs, and vaccines.

PlasmoDB provides the user with a variety of analysis tools for examining and extracting information from the genome and predicted proteome, including BLAST, electronic PCR, defined motif searches, and tools for the analysis of microarray and proteomics data. Information on the parasites may be gleaned by browsing the genome in Sequence View mode, and the various Gene Pages provide an encyclopedic view of the genome, where it is possible to look up all information about a specified gene.

Gene Queries using PlasmoDB. A wide variety of dynamic queries may be formulated to interrogate the PlasmoDB database, using pull-down menus available on the home page (also accessible via the Queries button on the blue tool bar, via the help and tutorial pages, and from links at relevant locations throughout the site). Three queries are shown, focused on characteristics that might be of interest to researchers seeking to mine the database for candidate vaccine antigens: (i) a search for genes predicted to encode a secretory signal sequence, (ii) a search for genes that are abundantly transcribed in late schizonts, and (iii) a search for genes that are conserved in but not in the human or mouse genomes.

Integrating gene queries by using the History function. The History page, accessible via the blue tool bar at the top of each page, lists all queries conducted during the current session. Using the boxes at left to select individual queries of continuing interest allows their result sets to be combined (union), subtracted, or intersected. Intersecting the three queries defined in Figure 1 yields 20 genes that satisfy the specified criteria with respect to phylogenetic distribution, expression, and predicted subcellular location. This list includes merozoite surface protein 1, apical membrane antigen 1, and numerous hypothetical proteins that may warrant further investigation as candidate vaccine antigens.
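
The combine, subtract, and intersect operations offered on the History page correspond to ordinary set operations on lists of gene identifiers. The following is a minimal Python sketch of that idea, not PlasmoDB's own implementation: the function names, variable names, and gene identifiers are all hypothetical placeholders, and the three example sets only stand in for the three queries described above.

```python
# Illustrative sketch only: these identifiers are made-up placeholders,
# not real PlasmoDB gene IDs or real query results.
signal_peptide = {"gene_01", "gene_02", "gene_03", "gene_05"}   # query (i)
late_schizont = {"gene_02", "gene_03", "gene_04", "gene_05"}    # query (ii)
not_in_host = {"gene_03", "gene_05", "gene_06"}                 # query (iii)


def combine(*queries):
    """Union: genes returned by at least one of the queries."""
    return set().union(*queries)


def intersect(first, *rest):
    """Intersection: genes returned by every one of the queries."""
    return set(first).intersection(*rest)


def subtract(first, *rest):
    """Difference: genes in the first query that appear in none of the others."""
    return set(first).difference(*rest)


# Genes that satisfy all three criteria at once.
candidates = intersect(signal_peptide, late_schizont, not_in_host)
print(sorted(candidates))
```

With these placeholder sets, only the identifiers present in all three inputs survive the intersection, which mirrors how intersecting the three queries narrows the list to genes meeting every criterion.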

