Showing posts with label TGS. Show all posts

Thursday, December 11, 2025

JITMM2025 presentation on Blastocystis

Hi,

A few days back, I returned from Bangkok after attending a great JITMM2025 conference. 

I promised to upload my presentation on the blog, so here goes:

PowerPoint presentation

So, what is it about? Well, with the use of the most recent DNA sequencing technologies, we have the chance to dig further and maybe even more precisely into the molecular epidemiology of Blastocystis. However, new advances are accompanied by new challenges. Here, I highlight some of them, providing concrete examples for each. 

I don't do this to put colleagues on the spot; we all make mistakes. I've made many mistakes myself, and we should learn from them and be critical of our own data and that of others. 

I realised that the notes for the presentation are not visible, so you can read them underneath:

Many thanks for watching.

 

Bw Rune

 

Slide 1 Notes:
I would like to thank the organizing committee for including this session on Blastocystis and for inviting me to do this talk.

Slide 2 Notes:
This talk is based primarily on these three articles that we published over the past five years.

Slide 3 Notes:
And the key message of the talk is that widespread misidentification of Blastocystis DNA in GenBank and in articles leads to inflated diversity estimates, incorrect subtyping, and misleading conclusions about host specificity and zoonotic potential.

Slide 4 Notes:
Blastocystis is an organism quite closely related to the potato pathogen Phytophthora, and it is the only member of the very heterogeneous group of Stramenopiles known to colonise the human intestinal tract. It belongs to Opalinata, a diverse assemblage of variously modified unicellular eukaryotes.

Importantly, it’s probably the easiest parasite to culture from human stool. A positive culture from a PCR-positive study individual would support active colonisation rather than contamination or mechanical transport through the digestive system.

Slide 5 Notes:
It is an organism that colonises the gut of probably more than 1 billion people globally, being the most common single-celled gut parasite of humans. It is also seen in many types of mammals, especially herbivores, and birds. Even insects have tested positive for Blastocystis. As far as I know, it has no invasive properties, and it has never been reported in outbreaks. 
Its physiological and ecological properties have been incompletely described, and its impact on human health and disease is only partly known. However, recent studies link Blastocystis to high gut microbial diversity, beneficial gut microbes, and favourable cardiometabolic parameters.
GenBank contains >15,000 ‘Blastocystis’ small subunit rDNA sequences, which form the basis for Blastocystis taxonomy, yet about 2% of these are not Blastocystis! I’ll give you examples.


Slide 6 Notes:
But first a bit of history. Before 2007, it had been established, especially by Graham Clark but also by groups in Germany, France and Japan, that Blastocystis – despite the lack of differential morphological hallmarks – exhibits extensive genetic diversity. It had also been established that Blastocystis could be found not only in humans but also in a variety of non-human hosts. However, the name ‘Blastocystis hominis’, which had been used to refer to Blastocystis found in humans, turned out to be invalid, since some of the genetic variants of Blastocystis found in humans could also be found in non-human hosts. The name was therefore set aside, and the subtype system was introduced instead, based on recent observations from molecular studies. In 2007, we had identified 9 subtypes from humans, other mammals and birds.


Slide 7 Notes:
And why do we bother with subtyping? We subtype Blastocystis to gain a better understanding of its taxonomy and epidemiology. Maybe different subtypes are associated with health … or disease? Maybe the distribution of subtypes has a geographical component? Which subtypes are host specific, and which ones are shared among different types of hosts – which ones are potentially zoonotic? Research advances rely on our ability to communicate data based on a harmonized and robust terminology. 
Combining host specificity data with data on genetic diversity can help us develop a biologically meaningful classical binomial taxonomy for Blastocystis and help us confirm that, in case there is host specificity, it is likely that we are looking at active colonisation rather than random, non-colonising strains from contamination.


Slide 8 Notes:
In 2009, we discovered ST10 in cattle, and by 2013, no fewer than 17 subtypes had been identified. In 2013, we were still in the pre-NGS era, but we started accepting that we could identify new subtypes based on DNA sequence data only (i.e., at that time we did not find it necessary to demonstrate a new subtype in culture before elevating a sufficiently divergent sequence to a novel subtype). 


Slide 9 Notes:
Here in 2025, we are looking at almost 50 subtypes, and we’re still counting, it seems. However, a scenario is developing where the traditional way of identifying subtypes is being challenged. Fake subtypes have been introduced, and about 2% of all Blastocystis sequences in GenBank are not Blastocystis, which has led to second-generation misidentification. These are regular errors that hamper scientific progress.

Slide 10 Notes:
And how do we define a novel subtype? Well, until 2023, it had been customary to use at least 4% genetic divergence across 18S sequences to differentiate between subtypes, which means that if two sequences diverged by 4% or more, they likely belonged to two different subtypes.
It's important to say that until now, it has been customary to sequence only part of the 18S to be able to call a subtype; however, the more sequence material, the better resolution, and the stronger the phylogenetic inferences, since variation exists in different parts of the gene depending on subtype. The barcode region, however, is sufficient to distinguish most subtypes, even today.
In 2023, there was a suggestion to use 2% instead of 4% as a criterion for distinguishing subtypes; however, that would lead to many more subtypes and spoil our attempts to obtain and convey an understanding of the epidemiology of Blastocystis, including host specificity. It would simply dilute the trends observed at the 4% level.
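To make the threshold idea concrete, here is a minimal Python sketch. It assumes two 18S sequences that have already been aligned to the same length and uses a plain proportion of mismatching positions; the notes do not specify how divergence is actually calculated (alignment method, gap handling, region of the gene), so the function names and the gap-skipping rule are illustrative assumptions only.

```python
def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Percent of differing positions between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') or an
    ambiguous base ('N') are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def likely_different_subtypes(seq_a: str, seq_b: str, threshold: float = 4.0) -> bool:
    """True if divergence meets or exceeds the chosen threshold (in percent)."""
    return percent_divergence(seq_a, seq_b) >= threshold


# Toy sequences (invented, not real Blastocystis data): 100 bases, 3 differences.
toy_a = "ACGT" * 25
toy_b = "TTT" + toy_a[3:]

print(percent_divergence(toy_a, toy_b))
print(likely_different_subtypes(toy_a, toy_b, threshold=4.0))
print(likely_different_subtypes(toy_a, toy_b, threshold=2.0))
```

The toy pair differs at 3 of 100 positions, so it falls within one subtype at the 4% level but would be split into two at the 2% level, which is the kind of inflation described above.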


Slide 11 Notes:
In 2017, a team from China introduced five new subtypes. After sequence review, it turned out that only one of these subtypes could be validated as a true Blastocystis subtype. The four other sequences were sequence chimeras or other sequence artefacts. A DNA sequence chimera is an artificial hybrid sequence formed when two or more unrelated DNA fragments are accidentally joined together during PCR amplification or sequencing workflows.
Because it does not exist in nature, a chimera misrepresents the true genetic content of the sample.
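The chimera idea can be illustrated with a deliberately simple Python sketch: if the left half of a query is most similar to one reference and the right half to a different one, the query deserves a closer look. This is not the review procedure behind the findings above, and dedicated chimera-detection tools are considerably more sophisticated; the sequences, reference names, and the half-and-half split are all invented for illustration, and the inputs are assumed to be pre-aligned to equal length.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length strings."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def closest_reference(query: str, references: dict, part: slice) -> str:
    """Name of the reference most similar to the query within the given slice."""
    return max(references, key=lambda name: identity(query[part], references[name][part]))


def looks_chimeric(query: str, references: dict) -> bool:
    """Flag a query whose two halves are closest to different references."""
    mid = len(query) // 2
    left_best = closest_reference(query, references, slice(0, mid))
    right_best = closest_reference(query, references, slice(mid, None))
    return left_best != right_best


# Hypothetical toy references (not real sequences).
references = {
    "reference_a": "ACGTACGTACGTACGTACGT",
    "reference_b": "TTGCATTGCATTGCATTGCA",
}

# An artificial hybrid: first half from reference_a, second half from reference_b.
hybrid = references["reference_a"][:10] + references["reference_b"][10:]

print(looks_chimeric(hybrid, references))
print(looks_chimeric(references["reference_a"], references))
```

A flag from a check like this is only a prompt for manual inspection, not proof that a sequence is an artefact.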

Slide 12 Notes:
It has turned out that some sequences referred to as Blastocystis are actually sequences of plants such as those listed here. An even bigger problem, however, is that at least 67 sequences in GenBank are listed as Blastocystis, while in fact being sequences of yeasts. This is very unfortunate, since yeasts are commonly found in the human gut, and therefore, second-generation misidentification can happen often in the hands of less experienced colleagues. 

Slide 13 Notes:
And this has already happened often. I will give you some examples. In 2020, Liao and colleagues surveyed parasites in dogs in the Chinese city of Guangzhou. They claimed to have found Blastocystis in 35 of 651 canine faecal samples. Well… 

Slide 14 Notes:
This is Figure 3 in the publication, and, intuitively, you’d think that this is a typical phylogenetic tree of Blastocystis. But you’d become suspicious when you see ST3 clustering with ST1 instead of ST4 and ST8. So, it’s worthwhile taking a closer look at the sequences. All the MK sequences are reference sequences from GenBank, and it says that they are Blastocystis, but they are not. They are yeast DNA sequences, which can easily be verified by simple BLAST queries. And the sequences indicated by filled black circles are study sequences that cluster with the yeast sequences named Blastocystis. 
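As a rough illustration of what such a BLAST check might look like in practice, the Python sketch below screens tab-separated BLAST results and flags any query that has high-identity hits whose titles do not mention Blastocystis. It assumes the results were saved from BLAST+ with a custom tabular layout (`-outfmt "6 qseqid sseqid pident length evalue bitscore stitle"`); the 97% identity cut-off, the function name, and the toy hit lines are invented for illustration and are not taken from any of the studies discussed.

```python
from collections import defaultdict

# Column order assumed to match the custom -outfmt string named in the lead-in.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "evalue", "bitscore", "stitle"]


def flag_unexpected_hits(tabular_text: str, min_identity: float = 97.0,
                         expected: str = "Blastocystis") -> dict:
    """Return {query_id: [titles]} for high-identity hits lacking the expected name."""
    flagged = defaultdict(list)
    for line in tabular_text.splitlines():
        if not line.strip():
            continue
        record = dict(zip(COLUMNS, line.split("\t")))
        if float(record["pident"]) < min_identity:
            continue
        if expected.lower() not in record["stitle"].lower():
            flagged[record["qseqid"]].append(record["stitle"])
    return dict(flagged)


# Hypothetical hit lines (made-up identifiers and scores, for illustration only).
toy_results = "\n".join([
    "study_seq_1\thit_1\t99.5\t480\t0.0\t870\tBlastocystis sp. small subunit ribosomal RNA gene",
    "study_seq_1\thit_2\t99.3\t480\t0.0\t865\tCandida sp. 18S ribosomal RNA gene",
    "study_seq_2\thit_3\t99.8\t500\t0.0\t910\tBlastocystis sp. small subunit ribosomal RNA gene",
])

print(flag_unexpected_hits(toy_results))
```

The point of looking beyond the single best hit is that the top hit may itself be a mis-annotated GenBank entry; a query whose near-identical hits are mostly yeast is suspect even if one of them carries the Blastocystis name.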


Slide 15 Notes:
So, in fact, this part of the tree is the only part that has Blastocystis sequences in it. 

Slide 16 Notes:
So, 11 of these sequences are not Blastocystis. And, after scrutinizing the data in this study, it turns out that the authors had identified only one Blastocystis sequence across 651 dogs. And that was an ST10 sequence that could stem from ingestion of or contamination with ruminant faeces, since ST10 is a ruminant-associated subtype. This is an example of pseudoscience, which undermines science. 

Slide 17 Notes:
This is another example. It’s a study by Can and colleagues, appearing in the Polish Journal of Veterinary Sciences in 2021, claiming to have found Blastocystis in stray cats in Turkey. 

Slide 18 Notes:
This is Figure 3 from the paper. At the bottom, you see a group comprising three reference sequences and seven study sequences clustering with 99% bootstrap support. The catch is that the three reference sequences are yeast sequences, which makes all seven study sequences yeast. So there was no evidence of Blastocystis in any of the 465 cats sampled. But the title of the article said something completely different. Please also note that they had ST4 appearing in two different places in the tree!

Slide 19 Notes:
A study by Liang and colleagues, published in the MDPI journal Animals, reported on Blastocystis in cattle in North China. 

Slide 20 Notes:
This is Figure 3 from the paper. It has some true Blastocystis sequences, most of which are ST10, which is as expected. No problems with that. However, you can see that ST1 clusters towards the very base of the tree; usually it clusters with ST2. So, this gives us a reason to be suspicious. Upon inspection, you would notice that the reference sequences MN696798 and MT645672, referred to as Blastocystis, are not Blastocystis but yeast DNA sequences. This means that the four study sequences seen at the base of the tree are not Blastocystis. Please note ON110353, which sits on a long branch; this could be a sequence artefact that should not have been included in the analysis, since subtypes in the apical part of Blastocystis phylogenetic trees rarely sit on long branches. 

Slide 21 Notes:
There are several other examples. And the major issue here is that such errors make us underestimate the host specificity of Blastocystis and blur the epidemiology of Blastocystis. 

Slide 22 Notes:
There are many more sequences in GenBank claimed to be Blastocystis that may be associated with error, and I’m happy to discuss that with those who are interested. 

Slide 23 Notes:
Here, I’ve listed some of the reasons why mis-annotation might happen. In the interest of time, I won’t go through them. I will post my presentation on the Blastocystis blog, so you can access the slides from there.

Slide 24 Notes:
So, at least 200 DNA sequences submitted to GenBank and used across more than 40 published studies have been named ‘Blastocystis’ although they are sequences of other genera or regular artefacts. Misidentification of Blastocystis is commonly observed, affecting subtype nomenclature, creating confusion, and leading to false biological conclusions.

Slide 25 Notes:
So, how do we proceed from here? It’s important to use integrative approaches. For instance, results from the use of NGS/TGS to detect and differentiate Blastocystis can be informed by results from short-term in-vitro culture. Sanger sequencing can be used to verify new subtypes identified by NGS/TGS. Experts should be consulted in cases of doubt, BLAST queries should be validated by colleagues, and a curated database should be used instead of GenBank.

Slide 26 Notes:
Indeed, the BlastoDB is a curated database in development under the Blastocystis One Health COST Action, and it is hosted at the University of Kent in the UK.


Slide 27 Notes:
Anastasios Tsaousis is coordinating it with a small steering group covering genomics, subtype nomenclature, and database infrastructure (Rune Stensvold, Eleni Gentekaki, and Andrew Roger).
It will function as a curated reference resource rather than just another “dumping ground”: there will be clear inclusion criteria, consistent subtype calls, and proper metadata (host, geography, methods, etc.).


Slide 28 Notes:
Crucial to the development, maintenance and expansion of the database is the contribution of sequence data (genes, genomes, etc.) and Blastocystis strains (live cultures and DNAs). We are actively looking for data and strains that can populate the BlastoDB and a BioBank, both of which are being established, so please let us know if you’re interested in helping. Samples can currently be sent free of charge. Let us help each other to increase the quality of Blastocystis research!


Slide 29 Notes:
I want to thank you all very much for listening, and I want to extend special thanks to the Department of Protozoology at Mahidol University. Thank you.