SIGS Instructions to Authors :: Phylogenetic Trees


Authors are encouraged to provide a phylogenetic tree showing the position of their sequenced strain relative to other strains that have fully or partially sequenced genomes. We recommend limiting the tree to 20 or fewer strains.

Recommended criteria for selection:

  • taxonomic type strains or other well-known members (e.g., model organisms)
  • strains in culture collections
  • strains for which a genome sequence is available
  • strains that together give a reasonable phylogenetic spread (the tree should show more than a single cluster of closely related strains belonging to the same species, subspecies, etc.); when possible, we suggest including diverse members of the taxonomic family

Suggested tools for building a phylogenetic tree include:

Selection of outgroup: our current automation for generating draft phylogenetic trees selects, as the outgroup, the most distant member within the set of selected strains. Authors may find it preferable to select one or more outgroups from a sister clade or from more distant lineages. For a helpful discussion of outgroup selection, see Grandcolas et al. 2004.
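The default outgroup choice described above can be illustrated with a minimal sketch. It assumes that "most distant member" means the strain with the largest mean pairwise distance to all other selected strains; this interpretation, the function name pick_outgroup, the strain labels, and the distance values are all hypothetical and are not taken from the SIGS automation.

```python
def pick_outgroup(names, dist):
    """Return the strain with the largest mean distance to the other strains.

    names: list of strain labels.
    dist:  symmetric matrix (list of lists) of pairwise distances,
           indexed in the same order as names.
    Assumption: "most distant" is read as "largest mean pairwise distance".
    """
    if len(names) < 3:
        raise ValueError("need at least three strains to choose an outgroup")
    best_name, best_mean = None, -1.0
    for i, name in enumerate(names):
        others = [dist[i][j] for j in range(len(names)) if j != i]
        mean = sum(others) / len(others)
        if mean > best_mean:
            best_name, best_mean = name, mean
    return best_name


# Hypothetical labels and made-up distances, for illustration only.
example_names = ["strain_A", "strain_B", "strain_C", "strain_D"]
example_dist = [
    [0.000, 0.020, 0.030, 0.100],
    [0.020, 0.000, 0.025, 0.110],
    [0.030, 0.025, 0.000, 0.120],
    [0.100, 0.110, 0.120, 0.000],
]
example_outgroup = pick_outgroup(example_names, example_dist)
```

An outgroup chosen this way always comes from inside the selected set, which is why an author-supplied outgroup from a sister clade may root the tree more reliably.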

Examples of trees and their associated figure captions are shown in Figures 1 and 2 below.

Example of SIGS short genome report phylogenetic tree as based on Bordetella petrii
Figure 1. Phylogenetic tree highlighting the position of B. petrii Se-1111R relative to other type strains within the Alcaligenaceae. Type strains shown are those within the Alcaligenaceae having fully sequenced genomes with assigned RefSeq numbers as of 2009/01/31. The type strains and their corresponding GenBank accession numbers for 16S rRNA genes are: B. pertussis, U04950; B. avium, U04947; B. petrii, AJ249861; B. parapertussis, U04949; and B. bronchiseptica, U04948. The tree uses sequences aligned by the RDP aligner, and uses the Jukes-Cantor corrected distance model to construct a distance matrix based on alignment model positions without the use of alignment inserts, and uses a minimum comparable position of 200. The tree is built with the RDP Tree Builder, which uses Weighbor (Bruno et al. 2000) with an alphabet size of 4 and a length size of 1000. The building of the tree also involves a bootstrapping process repeated 100 times to generate a majority consensus tree (Cole et al. 2007). B. bronchiseptica (U04948) was used as an outgroup.

Example of SIGS short genome report phylogenetic tree as based on Sphingomonas wittichii
Figure 2. Phylogenetic tree highlighting the position of Sphingomonas wittichii strain RW1 relative to other type and non-type strains within the Sphingomonadaceae. Strains shown are those within the Sphingomonadaceae having corresponding NCBI genome project IDs listed in Garrity et al. (2007). The strains and their corresponding GenBank accession numbers (and, when applicable, draft sequence coordinates) for 16S rRNA genes are (type=T): N. aromaticivorans strain SMCC F199T, U20756; Erythrobacter sp. strain NAP1, AAMW01000002.1:1127089-1128582; E. litoralis strain HTCC2594, CP000157; Sphingomonas sp. strain SKA58, AAQG01000001.1:1-836; S. wittichii strain RW1T, AB021492; and Z. mobilis strain ATCC 31821, AF281031. The tree uses sequences aligned by the RDP aligner, and uses the Jukes-Cantor corrected distance model to construct a distance matrix based on alignment model positions without the use of alignment inserts, and uses a minimum comparable position of 200. The tree is built with the RDP Tree Builder, which uses Weighbor (Bruno et al. 2000) with an alphabet size of 4 and a length size of 1000. The building of the tree also involves a bootstrapping process repeated 100 times to generate a majority consensus tree (Cole et al. 2007). Z. mobilis (AF281031) was used as an outgroup.
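Both captions refer to a Jukes-Cantor corrected distance computed with a minimum comparable position of 200. The sketch below shows the standard Jukes-Cantor correction, d = -(3/4) ln(1 - (4/3) p), where p is the fraction of mismatched positions. It is not the RDP implementation. It assumes that the threshold of 200 means a pair of sequences is compared only when at least 200 aligned positions carry an unambiguous base in both sequences; the function name and the example sequences are hypothetical.

```python
import math


def jukes_cantor_distance(seq_a, seq_b, min_comparable=200):
    """Jukes-Cantor corrected distance between two aligned sequences.

    Positions are counted as comparable only when both sequences have an
    unambiguous base (A, C, G, T; U is treated as T). Returns None when
    fewer than min_comparable positions can be compared (an assumed
    reading of the "minimum comparable position" setting).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    comparable = 0
    mismatches = 0
    for a, b in zip(seq_a.upper().replace("U", "T"),
                    seq_b.upper().replace("U", "T")):
        if a in valid and b in valid:
            comparable += 1
            if a != b:
                mismatches += 1
    if comparable < min_comparable:
        return None
    p = mismatches / comparable
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Made-up aligned sequences: 200 comparable positions, 20 mismatches (p = 0.1).
example_a = "A" * 200
example_b = "C" * 20 + "A" * 180
example_distance = jukes_cantor_distance(example_a, example_b)
```

Computing this value for every pair of strains would give the kind of distance matrix that a distance-based tree builder such as Weighbor takes as input.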

References

Bruno WJ, Socci ND, Halpern AL. Weighted neighbor joining: a likelihood-based approach to distance-based phylogeny reconstruction. Mol Biol Evol 2000;17(1):189-97. PMID: 10666718

Cole JR, Chai B, Farris RJ, Wang Q, Kulam-Syed-Mohideen AS, McGarrell DM, Bandela AM, Cardenas E, Garrity GM, Tiedje JM. The ribosomal database project (RDP-II): introducing myRDP space and quality controlled public data. Nucleic Acids Res 2007;35(Database issue):D169-72. doi: 10.1093/nar/gkl889

Felsenstein J. An alternating least squares approach to inferring phylogenies from pairwise distances. Syst Biol 1997;46(1):101-11. PMID: 11975348

Garrity GM, Lilburn TG, Cole JR, Harrison SH, Euzeby J, Tindall BJ. The Taxonomic Outline of Bacteria and Archaea version 7.7. Michigan State University Board of Trustees; 2007. http://taxonomicoutline.org

Grandcolas P, Guilbert E, Robillard T, D'Haese CA, Murienne J, Legendre F. Mapping characters on a tree with or without the outgroups. Cladistics 2004;20(6):579-582. doi: 10.1111/j.1096-0031.2004.00037.x

Ludwig W, Strunk O, Westram R, Richter L, Meier H, Yadhukumar, Buchner A, Lai T, Steppi S, Jobb G, et al. ARB: a software environment for sequence data. Nucleic Acids Res 2004;32(4):1363-71. doi: 10.1093/nar/gkh293

Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994;22(22):4673-80. PMID: 7984417



Acknowledgements

We would like to gratefully acknowledge the support of many members of the Genomic Standards Consortium, the broader genomic science community, and those who have indicated their willingness to serve as editors, reviewers and contributors.

Funding for SIGS is provided by a grant from the Office of the Vice President for Research and Graduate Studies at Michigan State University, the Michigan State University Foundation, and the US Department of Energy Biological and Environmental Research DE-FG02-08ER64707.
