Q: Why is a species sometimes represented in the TYGS phylogenies by more than one strain deposit?
For some species, the TYGS database contains genome sequences from more than one strain deposit (e.g. ATCC and DSM). If a user-provided
genome sequence closely matches such a species, all strain deposits of that species are usually included in the TYGS result.
The main reason is that the scientific literature reports rare cases in which such deposits unexpectedly differ to a considerable extent, which indicates
a strain mix-up or contamination (please find an example here). Including all deposits therefore makes the TYGS an important tool for uncovering such irregularities.
Apart from that, we think that having more than one strain deposit of the same species in the dataset is at most a cosmetic issue, not a scientific problem. If you still want to remove such "duplicates", you can download the trees in Newick format and remove the corresponding tips yourself. However, we advise against manipulating ready-made results after the fact.
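
For readers who nevertheless decide to prune a downloaded tree, the following is a minimal sketch of how tips could be removed from a Newick string using only the Python standard library. It is not part of the TYGS. The function names (parse_newick, prune, to_newick, remove_tips) and the tip labels in the example are hypothetical, and the parser assumes unquoted labels that contain no parentheses, commas, colons or semicolons. When removing a tip leaves an internal node with a single child, the sketch collapses that node and adds the two branch lengths together.

```python
import re

PUNCTUATION = {"(", ")", ",", ";"}


def parse_newick(text):
    """Parse a Newick string into nested dicts with name, length, children."""
    tokens = re.findall(r"[(),;]|[^(),;]+", text.strip())
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if tokens[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(parse_node())
                if tokens[pos] == ",":
                    pos += 1  # another sibling follows
                    continue
                pos += 1  # consume ")"
                break
        label = ""
        if pos < len(tokens) and tokens[pos] not in PUNCTUATION:
            label = tokens[pos]
            pos += 1
        name, _, length = label.partition(":")
        return {
            "name": name.strip(),
            "length": float(length) if length.strip() else None,
            "children": children,
        }

    return parse_node()


def prune(node, unwanted):
    """Return the subtree without the unwanted tips, or None if it is empty."""
    if not node["children"]:
        return None if node["name"] in unwanted else node
    pruned = (prune(child, unwanted) for child in node["children"])
    kept = [child for child in pruned if child is not None]
    if not kept:
        return None
    if len(kept) == 1:
        # Collapse the now-unary node; the child inherits the summed length.
        child = kept[0]
        if node["length"] is not None or child["length"] is not None:
            child["length"] = (node["length"] or 0.0) + (child["length"] or 0.0)
        return child
    node["children"] = kept
    return node


def to_newick(node):
    """Serialize a node (without the trailing semicolon)."""
    text = ""
    if node["children"]:
        text = "(" + ",".join(to_newick(c) for c in node["children"]) + ")"
    text += node["name"]
    if node["length"] is not None:
        text += ":" + format(node["length"], ".10g")
    return text


def remove_tips(newick_text, unwanted):
    """Remove the tips named in `unwanted` and return the new Newick string."""
    tree = prune(parse_newick(newick_text), set(unwanted))
    if tree is None:
        raise ValueError("all tips were removed")
    return to_newick(tree) + ";"


# Hypothetical example: two deposits of species A, keep only one of them.
example = "((A_DSM:0.1,A_ATCC:0.1):0.2,(B:0.3,C:0.4):0.1);"
print(remove_tips(example, ["A_ATCC"]))
```

Note that pruning changes only how the tree is displayed. The distances and clustering that the TYGS computed are unaffected, which is one reason the original result remains the version to cite.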