Extended 16S rRNA gene sequence analysis now available from main TYGS result page
[CHANGED] The extended SSU analysis is now found on the main TYGS result page and no longer on the subpage of the 16S GBDP phylogeny.
Reconstruction of proteome-based GBDP trees
[ADDED] For very diverse datasets of strains, the average branch support may be too low, even in the genome-based phylogeny, which is not unusual for such datasets. An optional proteome-based GBDP analysis will become available on user request if the dataset is not too large (< 30 strains) and the average branch support of the genome-based tree is below 50%.
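The two conditions above can be read as a simple eligibility check. The following is only a minimal sketch of that rule, not the TYGS implementation: the function name `proteome_tree_available` and its parameters are hypothetical, and only the two thresholds (fewer than 30 strains, average branch support below 50%) are taken from the entry above.

```python
def proteome_tree_available(n_strains: int, avg_branch_support: float,
                            max_strains: int = 30,
                            support_cutoff: float = 50.0) -> bool:
    """Return True if the optional proteome-based analysis could be offered.

    Hypothetical helper: both conditions from the changelog entry must hold,
    i.e. the dataset is small enough AND the genome-based tree is poorly
    supported (average branch support given in percent).
    """
    small_enough = n_strains < max_strains
    poorly_supported = avg_branch_support < support_cutoff
    return small_enough and poorly_supported


# Example: 12 strains with 41.5% average branch support would qualify.
eligible = proteome_tree_available(12, 41.5)
```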
[CHANGED] A routine in the coverage algorithm of the GBDP core was replaced. The new routine is somewhat more complex but better able to deal with highly similar genome sequences that differ significantly in length and are highly redundant, e.g., due to sequencing or assembly artifacts.
Splitting of user requests
[ADDED] Sometimes job submissions contain user genomes or metagenomic bins that are only remotely related, if at all. A single TYGS analysis including all of these genomes at once would result in a 'bloated' request once the closest type strain neighbors have been determined automatically for every genome. In such cases the TYGS now splits your request into two or more sub-requests reflecting groups of genomic relatedness. The job confirmation page now also holds an overview of the status of these forked requests and a direct link to the individual result pages. This feature further improves your genome-based taxonomic analyses by finding an even more suitable type strain selection.
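How the TYGS forms these groups is not described here. As an illustration only, one simple way to obtain "groups of genomic relatedness" is to link any two genomes whose pairwise distance falls below a cutoff and take the connected components. In the sketch below the function `split_into_groups`, the distance values and the cutoff of 0.1 are all assumptions made for the example.

```python
from itertools import combinations


def split_into_groups(genomes, distance, threshold):
    """Group genomes into connected components of 'related' pairs.

    Hypothetical sketch: two genomes are linked if distance(a, b) <= threshold;
    every resulting component would become one sub-request.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the component root (with path halving).
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        if distance(a, b) <= threshold:
            parent[find(a)] = find(b)  # merge the two components

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    return sorted(groups.values())


# Made-up pairwise distances for four uploaded genomes.
example_distances = {
    frozenset(("A", "B")): 0.02,
    frozenset(("C", "D")): 0.03,
}


def example_distance(a, b):
    # Pairs not listed above are treated as only remotely related.
    return example_distances.get(frozenset((a, b)), 0.4)


sub_requests = split_into_groups(["A", "B", "C", "D"], example_distance, 0.1)
```

With these made-up values the four genomes fall into two groups, so the request would be forked into two sub-requests.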
Compact PDF report of your individual TYGS result page
[ADDED] Since December 2019 the TYGS also offers a publication-ready PDF report for download, which can, in principle, be added to a paper as a supplement.
Genome-based approach to determine closest type strain genomes
[CHANGED] The most closely related type-strain genomes are now also determined independently of 16S rRNA gene sequences. In addition to the ten closest type-strain genomes determined via the smallest 16S rRNA gene sequence distances, the TYGS now also calculates intergenomic MASH distances (PubMed ID: 27323842), a fast approximation of intergenomic relatedness, between each uploaded genome sequence and the entire set of type-strain genomes contained in the TYGS database. The set of type-strain genomes with the smallest MASH distances is combined with those having the smallest 16S rRNA gene distances. This approach ensures that the most closely related type strains are chosen even if the user genome sequence does not contain a 16S rRNA gene, and even in organismal groups in which the 16S rRNA gene sequence does not resolve the pairwise relationships well. Note that the use of 16S rRNA gene sequences remains relevant for detecting closely related type strains that have not yet been genome-sequenced.
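The combination step described above amounts to taking the union of two nearest-neighbor sets. The sketch below illustrates that idea under stated assumptions: the function name `closest_type_strains` and the input dictionaries are hypothetical, distances are assumed to be precomputed, and the same `k` is used for both measures although the entry above only states the number ten for the 16S rRNA gene distances.

```python
def closest_type_strains(dist_16s, dist_mash, k=10):
    """Combine the k nearest type strains by 16S distance and by MASH distance.

    Hypothetical sketch: both arguments map a type-strain name to its distance
    from one user genome. dist_16s may be empty if the user genome contains
    no 16S rRNA gene; the MASH-based neighbors are then used on their own.
    """
    def top_k(distances):
        # Names of the k entries with the smallest distance values.
        return set(sorted(distances, key=distances.get)[:k])

    return top_k(dist_16s) | top_k(dist_mash)


# Made-up distances, with k=2 to keep the example small.
d16s = {"T1": 0.01, "T2": 0.02, "T3": 0.20}
dmash = {"T3": 0.05, "T4": 0.06, "T1": 0.30}
selected = closest_type_strains(d16s, dmash, k=2)
```

Because the result is a set union, a type strain that ranks among the nearest neighbors under both measures is included only once.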
Tree export to PNG format
[FIXED] The legend text is no longer cut off at the top when exporting trees to PNG format.
Quality criteria for the import of reference genomes
[CHANGED] New type(-strain) reference genome sequences are only included in the TYGS database if certain quality criteria are met. These criteria were recently adjusted: genomes with >500 contigs are now accepted if the annotation contains a 16S rRNA gene sequence. When introduced, this change resulted in the addition of only a few dozen new genome sequences.