
April 14, 2014

False positives complicate ancient pathogen identifications, but only if you are naive and arrogant

Writing about web page http://www.biomedcentral.com/1756-0500/7/111

I came across this piece published in BMC Research Notes a few weeks ago, but have only just found time to comment on it: http://www.biomedcentral.com/1756-0500/7/111

  • False positives complicate ancient pathogen identifications using high-throughput shotgun sequencing. BMC Research Notes 2014, 7:111. doi:10.1186/1756-0500-7-111. Michael G Campana (mcampana63@gmail.com), Nelly Robles García (nellym_robles@yahoo.com.mx), Frank J Rühli (frank.ruhli@anatom.uzh.ch), Noreen Tuross (tuross@fas.harvard.edu)

I cannot say that I am too happy with the style of the comments therein on our recent publication of metagenomic recovery of a TB genome from mummified remains (http://www.nejm.org/doi/full/10.1056/NEJMc1302295):

Additionally, a recent study by Chan and colleagues [54] claiming the identification of multiple strains of pathogenic tuberculosis (Mycobacterium tuberculosis) through non-targeted metagenomic sequencing has demonstrated insufficient analytical rigor to support their conclusions. The authors aligned their sequences against a single strain of pathogenic tuberculosis, but did not account for misalignments or environmental contamination with ubiquitous soil mycobacteria. Chan and colleagues’ data merit reanalysis with appropriate environmental controls. We recommend that the authors of these three studies demonstrate the veracity of their findings using a targeted capture approach and further bioinformatic analysis.

I guess working at Harvard makes people prone to academic arrogance! Perhaps there is also a whiff of sour grapes: they couldn't find any pathogens in their samples by metagenomics, so we can't have done so either! But dealing with the substance of the comments is easy enough. And ironically, I agree entirely with their earlier comments that the following two papers are highly suspect:

http://www.ncbi.nlm.nih.gov/pubmed/23553074,21765907

(NB they are misreferenced in this paper).

OK, so let's take their points on our study one by one...

  • The authors aligned their sequences against a single strain of pathogenic tuberculosis, but did not account for misalignments or environmental contamination with ubiquitous soil mycobacteria.
  • Mycobacterium tuberculosis is a genetically monomorphic species, so there is not much to be gained by aligning against multiple strain genomes. But we did also compare the SNP profiles of our genomes with the recent close relative 7199/99. In the standard filtering that we employed, SNPs with low or high coverage and low mapping scores were removed, thus avoiding problems with repetitive DNA. The fact that the majority of the mixed SNPs matched those of 7199/99 and H37Rv confirms that they are real. Plus, we did discuss the presence of environmental Actinobacteria in the metagenome in the Supplementary Material, where we report the presence of a Nocardia sp. at around 200X coverage and of a relative of Thermobifida fusca at around 10X coverage. We binned contigs according to Z score and coverage to avoid mixing up reads from different species. And we obtained deep and even coverage of the M. tuberculosis genome, which cannot be accounted for by misinterpretation of matches to environmental species. We have seen such spurious matches in some analyses, but they appear only when a low-stringency approach is applied to mapping and are obvious because they show spiky coverage limited to conserved regions (e.g. rRNA genes) rather than across the whole genome.
  • Chan and colleagues’ data merit reanalysis with appropriate environmental controls.
  • And what might these controls be? We have analysed a piece of lung tissue from mummified remains from a casket rather than the soil. As detailed in previous papers (http://www.ncbi.nlm.nih.gov/pubmed/?term=12576588+12541332+18399990), rigorous efforts were taken to avoid contamination during sampling and storage. We have never grown M. tuberculosis or sequenced TB genomes in the lab. Where else could the M. tuberculosis DNA have come from other than the sampled individual?
  • We recommend that the authors of these three studies demonstrate the veracity of their findings using a targeted capture approach and further bioinformatic analysis.
  • There is some faulty logic here. We are indeed contemplating using a capture-based approach to increase the sensitivity of our analyses, but this will do nothing for the specificity of the approach, since any contaminating sequences that map to the pathogenic reference strains in silico are likely to be captured in vitro anyway because of their similarity to the bait. The answer is instead to increase the stringency of mapping and look for a consilience of results from multiple sources of evidence (e.g. evenness of coverage, SNPs that allow an assignment within an established clade), which we have done.

We are continuing to perform metagenomics on mummified material from Vác and on other historical samples and will be publishing additional studies in due course. This is an exciting area of research and one does have to be careful in interpretation, but our findings stand firm. Anyone who wants to repeat the analyses we reported in Chan et al is welcome to do so. The reads are available here: http://www.ncbi.nlm.nih.gov/sra?LinkName=pubmed_sra&from_uid=23863071

But I am afraid I agree with Campana et al when they criticise the other two papers: Thèves et al because, inter alia, you cannot tell Shigella from E. coli by 16S, and Khairat et al because no sequence data are available in the public domain. Caveat lector! But enjoy the excitement of progress in this field (see also http://www.ncbi.nlm.nih.gov/pubmed/?term=24708363+23765279)

