Type Strain Genome Server

Q: What do the three different digital DDH (dDDH) formulas (d0, d4, d6) mean?

Table 3 of the TYGS result page contains the pairwise dDDH values between your user genomes and the selected type-strain genomes. The dDDH values are provided along with their confidence intervals (C.I.) for the three different GBDP (Genome BLAST Distance Phylogeny) formulas:

  • formula d0 (a.k.a. GGDC formula 1): total length of all HSPs (high-scoring segment pairs) divided by total genome length
  • formula d4 (a.k.a. GGDC formula 2): sum of all identities found in HSPs divided by overall HSP length
  • formula d6 (a.k.a. GGDC formula 3): sum of all identities found in HSPs divided by total genome length
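
As a minimal sketch of how the three ratios in the list above differ, the following Python snippet computes them from a list of HSPs. The `Hsp` record, the function name `gbdp_ratios`, and the toy numbers are assumptions made for illustration only. The sketch follows the wording of the list literally; it is not the TYGS or GGDC implementation and does not reproduce the further steps that lead to dDDH values and confidence intervals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hsp:
    """One high-scoring segment pair (hypothetical record for this sketch)."""
    length: int       # aligned length of the HSP
    identities: int   # identical positions within the HSP


def gbdp_ratios(hsps, total_genome_length):
    """Return the d0-, d4-, and d6-style ratios as worded in the list above."""
    if total_genome_length <= 0:
        raise ValueError("total_genome_length must be positive")
    hsp_length = sum(h.length for h in hsps)
    identities = sum(h.identities for h in hsps)
    return {
        # d0-style: HSP length relative to total genome length
        "d0": hsp_length / total_genome_length,
        # d4-style: identities relative to HSP length (0.0 if no HSPs exist)
        "d4": identities / hsp_length if hsp_length else 0.0,
        # d6-style: identities relative to total genome length
        "d6": identities / total_genome_length,
    }


# Toy input (invented numbers): two HSPs covering 8,000 of 10,000 positions.
example = gbdp_ratios([Hsp(5000, 4500), Hsp(3000, 2700)], 10_000)
print(example)
```

With these invented numbers, the d0-style ratio is 0.8, the d4-style ratio is 0.9, and the d6-style ratio is 0.72: only the d4-style value ignores how much of the genome lies outside the HSPs.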

While formulas d0 and d6 reflect the similarity or dissimilarity of the underlying genomes in terms of gene content, formula d4 reflects the proportion of sequence identity within the homologous parts of the underlying genomes. Moreover, formula d4 is independent of genome length and is thus robust against the use of incomplete draft genomes.
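
To illustrate why a ratio that does not involve genome length is less sensitive to missing sequence, the toy calculation below compares a complete genome with a draft that lacks part of it. All numbers and the helper name `ratios` are invented for illustration, and the model assumes, as a simplification, that sequence identity within the remaining HSPs is unchanged. It is not derived from TYGS output.

```python
def ratios(hsp_length, identities, genome_length):
    """d0-, d4-, and d6-style ratios from aggregate counts (sketch only)."""
    return (
        hsp_length / genome_length,   # d0-style
        identities / hsp_length,      # d4-style
        identities / genome_length,   # d6-style
    )


# Hypothetical complete genome: 10,000 positions, 8,000 of them in HSPs,
# with 90% identity inside the HSPs.
complete = ratios(hsp_length=8000, identities=7200, genome_length=10_000)

# Hypothetical draft missing 2,000 positions, 1,000 of which were in HSPs
# (same 90% identity), so HSP length, identities, and genome length all shrink.
draft = ratios(hsp_length=7000, identities=6300, genome_length=8000)

for name, full, part in zip(("d0", "d4", "d6"), complete, draft):
    print(f"{name}: complete={full:.4f} draft={part:.4f}")
```

In this toy case the d0-style and d6-style ratios shift (0.8000 to 0.8750 and 0.7200 to 0.7875), while the d4-style ratio stays at 0.9000, because neither its numerator nor its denominator depends on total genome length.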

More information on these formulas and the underlying GBDP method can be found in the literature.