Digital DNA-DNA hybridization for microbial species delineation by means of genome-to-genome sequence comparison

Alexander F. Auch, Mathias von Jan, Hans-Peter Klenk, Markus Göker

Abstract


The pragmatic species concept for Bacteria and Archaea is ultimately based on DNA-DNA hybridization (DDH). While enabling the taxonomist, in principle, to obtain an estimate of the overall similarity between the genomes of two strains, this technique is tedious and error-prone and cannot be used to incrementally build up a comparative database. Recent technological progress in the area of genome sequencing calls for bioinformatics methods to replace the wet-lab DDH by in-silico genome-to-genome comparison. Here we investigate state-of-the-art methods for inferring whole-genome distances in their ability to mimic DDH. Algorithms to efficiently determine high-scoring segment pairs or maximally unique matches perform well as a basis for inferring intergenomic distances. The examined distance functions, which are able to cope with heavily reduced genomes and repetitive sequence regions, outperform previously described ones in both their correlation with DDH and their error ratios in emulating it. Simulation of incompletely sequenced genomes indicates that some distance formulas are very robust against missing fractions of genomic information. Digitally derived genome-to-genome distances show a better correlation with 16S rRNA gene sequence distances than DDH values do. The future perspectives of genome-informed taxonomy are discussed, and the investigated methods are made available as a web service for genome-based species delineation.
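To make the idea of a distance function computed from local matches more concrete, the following is a minimal sketch in Python. It is an illustration only and not necessarily one of the formulas evaluated in the article: the `Hsp` record, the function names, and the assumption that matches do not overlap are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Hsp:
    """One local match between two genomes (hypothetical record type)."""
    length: int      # aligned length in base pairs
    identities: int  # identical base pairs within the match


def coverage_distance(hsps: Iterable[Hsp], len_x: int, len_y: int) -> float:
    """1 - (matched bases / total bases), counted over both genomes.

    Assumption: matches do not overlap, so each match covers
    `length` bases in genome x and `length` bases in genome y.
    """
    matched = 2 * sum(h.length for h in hsps)
    return 1.0 - matched / (len_x + len_y)


def identity_distance(hsps: Iterable[Hsp]) -> float:
    """1 - (identical bases / matched bases).

    Genome lengths do not enter this formula at all.
    """
    hsps = list(hsps)
    matched = sum(h.length for h in hsps)
    if matched == 0:
        return 1.0  # no matches: treat the genomes as maximally distant
    return 1.0 - sum(h.identities for h in hsps) / matched


# Toy input: two matches between a 4000 bp and a 6000 bp genome.
example = [Hsp(length=1000, identities=950), Hsp(length=500, identities=450)]
print(coverage_distance(example, len_x=4000, len_y=6000))
print(identity_distance(example))
```

In this sketch the second function depends only on the matched regions, which is one plausible reason why a formula of this kind could be less sensitive to missing fractions of a genome than one normalized by total genome length.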

 

DOI: 10.4056/sigs.531120


Keywords


Archaea, Bacteria, BLAST, GBDP, genomics, MUMmer, phylogeny, species concept, taxonomy


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.

This article (doi:10.4056/sigs.531120) has been cited by 59 other articles.



