Chapter 9: Molecular Phylogenetic Analysis

Phylogenetic analysis, in the strictest sense, is the process of testing hypotheses about the descent of species from a common ancestor. This chapter provides an overview of a task essential to all such analyses: obtaining a working hypothesis of the evolutionary relationships among a group of organisms, summarized as a phylogenetic tree. Molecular phylogenetic analysis became more powerful and more accessible with the advent of rapid, inexpensive DNA sequencing, eventually leading to a major revision in the understanding of the evolutionary relationships among all living organisms. To provide a statistically robust representation of the phylogeny, a phylogenetic marker needs a sufficient number of independently evolving positions so that at least several changes differentiate the most closely related taxa of interest. After a phylogenetic marker has been selected, one must obtain sequence data, which must then be aligned. The core of automated alignment algorithms is the optimization of a scoring function: points are added for identities and similarities at each position, points are subtracted for mismatches, and many more points are subtracted for gaps. The chapter emphasizes the use of aligned sequence databases in preference to de novo alignment, not only for the savings in time and effort but also for the higher quality of the alignments. It discusses maximum parsimony and maximum likelihood, the most widely implemented character-based methods, as well as Bayesian phylogenetic inference, a relatively new method rapidly growing in popularity. Although it may be possible to identify an optimal topology, the reader is cautioned that an optimal tree is not guaranteed to be the true phylogeny.
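
The scoring idea in the abstract can be illustrated with a minimal sketch that scores two sequences that have already been aligned. The function name `score_alignment` and the point values (+1 for an identity, -1 for a mismatch, -2 for a gap) are assumptions chosen for illustration and do not come from the chapter. Real alignment programs typically use substitution matrices, which also reward similarities, and more elaborate gap penalties.

```python
def score_alignment(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Score two already-aligned sequences column by column.

    The default point values are illustrative assumptions, not values
    taken from any particular alignment program.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue  # a column of two gaps carries no information
        if a == "-" or b == "-":
            score += gap  # gaps are penalized most heavily
        elif a == b:
            score += match  # identity
        else:
            score += mismatch
    return score


# Two candidate placements of a single gap against the same reference.
reference = "ACGTT"
candidate_1 = "AC-TT"
candidate_2 = "ACT-T"
print(score_alignment(reference, candidate_1))
print(score_alignment(reference, candidate_2))
```

An alignment algorithm searches over gap placements like these and keeps the one with the highest score; under the assumed values, the first candidate scores higher because it avoids the extra mismatch.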

Citation: Dethlefsen L, Lepp P, Relman D. 2011. Molecular Phylogenetic Analysis, p 145-165. In Persing D, Tenover F, Tang Y, Nolte F, Hayden R, van Belkum A (ed), Molecular Microbiology. ASM Press, Washington, DC. doi: 10.1128/9781555816834.ch9

FIGURE 1

A typical workflow for molecular phylogenetic analysis.

FIGURE 2

Orthologous, paralogous, and xenologous homology. Organisms and genomes are depicted as rectangles and ovals, respectively, with large versions for ancestors and small versions representing extant strains available for sampling. The white and black bars on the genome represent genes that illustrate the relationship corresponding to each term. The small shapes decorating some genes indicate that additional changes can occur, leading to recognizable younger clades, but are not meant to imply that other genes have remained identical to the ancestral sequence. (A) The white and black versions of the gene are orthologous homologs in clades 1 and 2, since the divergence of the clades is the event that separated the white and black gene lineages. Any comparison between a white and a black gene is expected to reveal the initial divergence equally well. (B) The white and black versions of the gene are paralogous, having arisen from a gene duplication within a single genome. The common ancestor of each clade carried both paralogs; the use of either the white or the black version as a phylogenetic marker would be expected to depict the organismal phylogeny equally well. However, one member of clade 1 has lost the white paralog; the unwitting substitution of the black paralog in its place would incorrectly identify that organism as having branched off prior to the split between clades 1 and 2. (C) The white version of the gene is a xenolog of the black version, since its presence in clade 1 is due to a horizontal gene transfer event rather than inheritance from the common ancestor of clade 1. The ancestral gene is shown in the common ancestor of clades 1 and 2, implying that it has been lost in the lineage leading to the ancestor of clade 1, but it might also have been absent in the common ancestor of both clades and originated in the lineage leading to clade 2. Although clade 1 is shown as having no homolog of the gene prior to the horizontal transfer, in some cases a xenologous gene may either displace or coexist with an orthologous homolog.

FIGURE 3

BLAST interface at the website of the National Center for Biotechnology Information (NCBI). (A) Screen shot of the blastp (protein blast) query submission page. Numbered arrows highlight the following features: 1, button to access extensive online help; 2, user-generated job title for keeping track of multiple BLAST queries; 3, restriction of the search to a particular taxonomic group; 4, restriction of the search using an Entrez query, in this case to search only sequences from cultivated organisms (the unusual syntax, with an initial term that has no effect, is required in this case because the query is not permitted to begin with the Boolean term “NOT”); 5, algorithm parameters that can be customized but are not initially displayed. Usually choosing between the named algorithms above offers sufficient flexibility, but users with extensive or challenging searches may benefit from understanding and using the detailed parameter settings. (B) Screen shot of blastp results highlighting the following features: 1, panels to customize online and downloaded results format; 2, selection between multiple query sequences submitted in a single batch; 3, summary of the phylogenetic and taxonomic distribution of the BLAST hits; 4, links to the standard GenBank records; 5, links to “value-added” databases such as whole-genome sequences.

FIGURE 4

Ribosomal Database Project website. (A) Screen shot of the initial page for the Classifier tool, which provides a taxonomic classification for the sequences selected by the user. Numbered arrows highlight the following features: 1, additional RDP tools including Seqmatch, which finds the sequence(s) in the database most similar to each submitted sequence, and Treebuilder, which interactively generates small trees online; 2, ability to upload new sequences into a private myRDP account for automated alignment and storage on the server; 3, extensive online help files; 4, ability to apply tools to private sequences as well as those in the RDP database. (B) Screen shot of Classifier results highlighting the following features: 1, user-adjustable stringency for classification; 2, ability to download all sequence classifications as a single file; 3, numbers indicate the distribution of the submitted sequences to various taxa: clicking on taxon names will show which sequences are classified into each taxon.

FIGURE 5

Pfam website. (A) Screen shot of results from a protein sequence search. Numbered arrows highlight the following features: 1, extensive online help files; 2, architecture (protein domain structure) found in the query sequence; 3, table with information about each domain found in the query; 4, alignment shown between one domain of the query sequence and the hidden Markov model which describes the sequence information conserved in the domain family; 5, link to complete information about that particular Pfam domain family. (B) Screen shot of the home page for a Pfam domain family, highlighting the following features: 1, different categories of information available about the domain; 2, listing of all known protein architectures containing the domain; 3, link to all protein sequences of a particular architecture.

FIGURE 6

Alignment. (A) The region shown for hypothetical Sequences 1 to 3 can be aligned unambiguously, although the comparison between Sequences 1 and 2 indicates a mismatch and the comparison between Sequences 1 and 3 indicates an indel (insertion-deletion event). The alignment is an assertion that the nucleotides in a single column are homologous, i.e., descended from a single ancestral nucleotide. However, Sequence 4 cannot be aligned unambiguously with the information given; two possible alternatives are shown. See the text for additional discussion. (B) Alignment software. A screen shot is shown of the Arb sequence editor, one component of the Arb phylogenetic software, which is designed especially for 16S rRNA data. The following features are highlighted: 1, sequence coordinates corresponding to the cursor position; 2, search fields for sequence motifs; 3, 16S rRNA sequence for reference; 4, symbols indicating position-specific secondary and tertiary interactions predicted by a model of 16S rRNA structure; 5, a sequence mask, with plus signs indicating columns aligned with sufficient confidence to be included in subsequent analysis; 6, aligned columns of nucleotides, color coded to assist in manual editing. Symbols underlining some nucleotides indicate the potential for that nucleotide to participate in an interaction predicted by the structural model.
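
The sequence mask described in panel B can be sketched as a simple column filter. This is an illustration only: the function name `apply_mask`, the dictionary representation of the alignment, and the use of a plain string as the mask are assumptions, and the sketch does not reproduce Arb's own filter format or interface. It keeps only the columns flagged with a plus sign.

```python
def apply_mask(alignment, mask, keep="+"):
    """Return a copy of the alignment holding only the masked-in columns.

    alignment: dict mapping sequence name to aligned sequence string
    mask: string with one character per column; columns whose character
          equals `keep` are retained (hypothetical representation)
    """
    for name, seq in alignment.items():
        if len(seq) != len(mask):
            raise ValueError(
                f"{name} has length {len(seq)}, mask has length {len(mask)}"
            )
    # Indices of the columns judged reliable enough to analyze.
    kept = [i for i, flag in enumerate(mask) if flag == keep]
    return {
        name: "".join(seq[i] for i in kept)
        for name, seq in alignment.items()
    }


# Hypothetical three-sequence alignment; columns 3 and 5 are masked out.
alignment = {
    "Sequence 1": "ACGT-A",
    "Sequence 2": "ACCTTA",
    "Sequence 3": "AC-TTA",
}
mask = "++-+-+"
filtered = apply_mask(alignment, mask)
for name, seq in filtered.items():
    print(name, seq)
```

Filtering this way leaves every sequence the same length, so the retained columns still line up as homologous positions for the downstream tree-building step.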


