April 15, 2014

Videos and photos from the Pallen Inaugural

Follow-up to Nothing in Microbiology makes Sense except in the Light of Evolution from The Microbial Underground

Live streamed version:




Slidecast version (better sound quality and no need to look at my ugly mug!)


Photos from the day here:

https://www.dropbox.com/sh/ao2xx91qrjl8rtu/2FijtEacI9


Nothing in Microbiology makes Sense except in the Light of Evolution

Writing about web page https://storify.com/mjpallen/palleninaugural

Here is the online companion to my Inaugural Lecture.



April 14, 2014

False positives complicate ancient pathogen identifications, but only if you are naive and arrogant

Writing about web page http://www.biomedcentral.com/1756-0500/7/111

I came across this piece published in BMC Research Notes a few weeks ago, but have only just found time to comment on it: http://www.biomedcentral.com/1756-0500/7/111

  • False positives complicate ancient pathogen identifications using high-throughput shotgun sequencing BMC Research Notes 2014, 7:111 doi:10.1186/1756-0500-7-111 Michael G Campana (mcampana63@gmail.com) Nelly Robles García (nellym_robles@yahoo.com.mx) Frank J Rühli (frank.ruhli@anatom.uzh.ch) Noreen Tuross (tuross@fas.harvard.edu)

I cannot say that I am too happy with the style of the comments therein on our recent publication of metagenomic recovery of a TB genome from mummified remains (http://www.nejm.org/doi/full/10.1056/NEJMc1302295):

Additionally, a recent study by Chan and colleagues [54] claiming the identification of multiple strains of pathogenic tuberculosis (Mycobacterium tuberculosis) through non-targeted metagenomic sequencing has demonstrated insufficient analytical rigor to support their conclusions. The authors aligned their sequences against a single strain of pathogenic tuberculosis, but did not account for misalignments or environmental contamination with ubiquitous soil mycobacteria. Chan and colleagues’ data merit reanalysis with appropriate environmental controls. We recommend that the authors of these three studies demonstrate the veracity of their findings using a targeted capture approach and further bioinformatic analysis.

I guess working at Harvard makes people prone to academic arrogance! Perhaps there is also a whiff of sour grapes: they couldn't find any pathogens in their samples by metagenomics, so we can't have done either! But dealing with the substance of the comments is easy enough. And ironically, I agree entirely with their earlier comments that these two papers are highly suspect:

http://www.ncbi.nlm.nih.gov/pubmed/23553074,21765907

(NB they are misreferenced in this paper).

OK, so let's take their points on our study one by one...

  • The authors aligned their sequences against a single strain of pathogenic tuberculosis, but did not account for misalignments or environmental contamination with ubiquitous soil mycobacteria.
  • Mycobacterium tuberculosis is a genetically monomorphic species, so there is not much to be gained by aligning against multiple strain genomes. But we did also compare the SNP profiles of our genomes with the recent close relative 7199/99. In the standard filtering that we employed, SNPs with low/high coverage and low mapping scores were removed, thus avoiding problems with repetitive DNA. The fact that the majority of the mixed SNPs matched those of 7199/99 and H37Rv confirms that they are real. Plus, we did discuss the presence of environmental Actinobacteria in the metagenome in the Supplementary Material, where we report the presence of a Nocardia sp. at around 200X coverage and of a relative of Thermobifida fusca at around 10X coverage. We binned contigs according to Z score and coverage to avoid mixing up reads from different species. And we obtained deep and even coverage of the M. tuberculosis genome, which cannot be accounted for by misinterpretation of matches to environmental species. We have seen such spurious matches in some analyses, but they appear only when a low-stringency approach is applied to mapping and are obvious because they show spiky coverage limited to conserved regions (e.g. rRNA genes) rather than across the whole genome.
  • Chan and colleagues’ data merit reanalysis with appropriate environmental controls.
  • And what might these controls be? We have analysed a piece of lung tissue from mummified remains from a casket rather than the soil. As detailed in previous papers (http://www.ncbi.nlm.nih.gov/pubmed/?term=12576588+12541332+18399990), rigorous efforts were taken to avoid contamination during sampling and storage. We have never grown M. tuberculosis or sequenced TB genomes in the lab. Where else could the M. tuberculosis DNA have come from other than the sampled individual?
  • We recommend that the authors of these three studies demonstrate the veracity of their findings using a targeted capture approach and further bioinformatic analysis.
  • There is some faulty logic here. We are indeed contemplating using a capture-based approach to increase the sensitivity of our analyses, but this will do nothing for the specificity of the approach, since any contaminating sequences which map to the pathogenic reference strains in silico are likely to be captured in vitro anyway because of their similarity to the bait. The answer is instead to increase the stringency of mapping and look for a consilience of results from multiple sources of evidence (e.g. evenness of coverage, SNPs that allow an assignment within an established clade), which we have done.
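
To make the "evenness of coverage" argument above a little more concrete, here is a minimal sketch of how even and spiky coverage might be told apart from per-base read depths. It is an illustration only, not the analysis used in the paper: the function name, the choice of breadth and coefficient of variation as summary statistics, and the toy depth profiles are all assumptions.

```python
from statistics import mean, pstdev


def coverage_summary(depths):
    """Summarise a per-base depth profile (hypothetical helper).

    depths: sequence of read depths, one value per reference position.
    Returns mean depth, breadth (fraction of positions with depth > 0)
    and the coefficient of variation (standard deviation / mean).
    """
    n = len(depths)
    if n == 0:
        raise ValueError("depth profile is empty")
    avg = mean(depths)
    breadth = sum(1 for d in depths if d > 0) / n
    # A genome covered only at a few conserved loci gives a very high CV.
    cv = pstdev(depths) / avg if avg > 0 else float("inf")
    return {"mean_depth": avg, "breadth": breadth, "cv": cv}


# Toy profiles with the same mean depth but very different shapes.
even_profile = [10] * 1000                # reads spread across the genome
spiky_profile = [0] * 990 + [1000] * 10   # reads piled onto a conserved region

print(coverage_summary(even_profile))
print(coverage_summary(spiky_profile))
```

Both toy profiles have the same mean depth, which is why mean depth alone says little: the even profile has full breadth and no variation, whereas the spiky one covers only a sliver of the reference.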

We are continuing to perform metagenomics on mummified material from Vác and on other historical samples and will be publishing additional studies in due course. This is an exciting area of research and one does have to be careful in interpretation, but our findings stand firm. Anyone who wants to repeat the analyses we reported in Chan et al is welcome to do so. The reads are available here: http://www.ncbi.nlm.nih.gov/sra?LinkName=pubmed_sra&from_uid=23863071

But I am afraid I agree with Campana et al when they criticise the other two papers: Thèves et al because, inter alia, you cannot tell Shigella from E. coli by 16S, and Khairat et al because no sequence data is available in the public domain. Caveat lector! But enjoy the excitement of progress in this field (see also http://www.ncbi.nlm.nih.gov/pubmed/?term=24708363+23765279).


February 19, 2014

The Origin of Research Projects, with hat tip to Darwin

While mulling over the process of submitting research proposals, I came up with this:

"More individuals are born than can possibly survive. A grain in the balance will determine which individual shall live and which shall die,—which variety or species shall increase in number, and which shall decrease, or finally become extinct. As the individuals of the same species come in all respects into the closest competition with each other, the struggle will generally be most severe between them; it will be almost equally severe between the varieties of the same species, and next in severity between the species of the same genus. But the struggle will often be very severe between beings most remote in the scale of nature. The slightest advantage in one being, at any age or during any season, over those with which it comes into competition, or better adaptation in however slight a degree to the surrounding physical conditions, will turn the balance.”

Origin of Species, 1859, Charles Darwin

"More proposals are submitted than can possibly be funded. A grain in the balance will determine which proposal shall be funded and which shall die,—which research group or institution shall increase in number, and which shall decrease, or finally become extinct. As the individuals in the same research area come in all respects into the closest competition with each other, the struggle will generally be most severe between them; it will be almost equally severe between researchers in the same sub-discipline, and next in severity between sub-disciplines of the same discipline. But the struggle will often be very severe between proposals most remote in the type of research. The slightest advantage in one proposal, at any stage or during any meeting, over those with which it comes into competition, or better adaptation in however slight a degree to the surrounding political conditions, will turn the balance.”

Origin of Research Projects, 2014, Mark Pallen


February 07, 2014

Bugs that may not only kill bugs

Unlike most normal well-adjusted folk, I never really lost my childhood obsession with "bugs". Now I find myself still playing with insects for what appear on the surface to be very different reasons, but underneath the reason is still the same: basic curiosity. This soon evolved into an interest in what diseases (another type of “bug”) they can suffer from. The more research I did into this, the more I came to understand that they are not so different from humans. They have very similar immune systems and can suffer very similar diseases. In fact, it wasn’t long before I began to wonder whether we could catch the same diseases as them.


I am not talking here about insects acting as vectors for human diseases, I mean can bacterial pathogens cause equivalent diseases in both insects and people?

I am particularly interested in a common insect pathogen, which goes by the name of Photorhabdus. This bacterium, which is closely related to plague bacteria (Yersinia pestis), is carried around in the soil by a symbiont nematode worm called Heterorhabditis. This worm burrows into insects and then regurgitates the bacteria, which then rapidly kill the insect. The bacteria produce a large number of drug-like molecules, which prevent any other organisms from infecting the insect, and also sacrifice themselves as a food source for the reproducing worms. The Photorhabdus can control the development of the nematode (somehow) and re-associate with it before they leave the cadaver in search of new insect prey.

This fascinating life cycle provides many interesting aspects to study, such as symbiosis, drug discovery, immune evasion and toxin biology. Not to mention that they glow: Photorhabdus is the only terrestrial bioluminescent bacterium. Who could not be interested in a glowing disease?


In fact they are so good at killing insects that humans have used them for decades as a biological pest control agent to protect our orchards, greenhouses and golf courses. What has particularly attracted my attention however is that certain strains of this ubiquitous insect pathogen have been found causing a very unpleasant disease in humans!


The Photorhabdus genus contains three distinct species. Two of these, P. luminescens and P. temperata, can only infect insects, while the third, P. asymbiotica, can also infect people. Genetically there is very little difference between these three species, suggesting that the “jump” into humans required very little genetic change.


Our recent studies into how P. asymbiotica can pull off the trick of causing disease in both insects and humans have given us some surprising results. The main trick appears to be that it has evolved the ability to tolerate growth at 37°C, a temperature which kills the insect-restricted strains.

However, at 37°C P. asymbiotica loses the ability to use many of the nutrients that it can otherwise use at the lower, insect host temperature. We may naively assume that this would make it a “weaker” pathogen. However, there are several examples of other bacterial pathogens of humans that appear to have undergone significant “loss of functions” when compared to their close relatives. On the other hand, Photorhabdus appears to up-regulate many of the same toxins it would use against insect hosts when exposed to the human body temperature of 37°C. Same tools, different job?

As pathogens of insects far outnumber those of humans, they provide a massive reservoir of potential future agents of human disease. How seriously we should take this threat will depend on how minor the genetic changes were that allowed P. asymbiotica to develop this “split personality”.


January 31, 2014

The ‘Other’ side

I have just recently moved back to the UK from beautiful, warm Italy. Well, personally it was a much bigger move: from industry back to academia.

Several people have been curious about my experience (while some just think I am plain crazy to give up a Company position), so I thought I would share some of the things I learnt from the ‘other’ side in this blog!

When I first started, I remember being amazed by the amount of careful planning and effort that goes into getting something from a research lab to a product. The attention to detail, the endless number of tests and approvals, and the complexity of the entire process are overwhelming. The large teams of experts, all believing in and working towards the same goal, are really quite something.

Setting objectives and having real deadlines were both foreign to me, but I did not really dislike this part. In fact I felt that these actually make you use your time more carefully. Drawing up personal objectives and reviewing them regularly is actually quite productive. Stopping projects that are going nowhere is a sad but good thing, and the sooner the better.

I think all these ‘company’ routines could actually help an academic, as we quite often tend to lose track of things! I think it would certainly not hurt to say: let's try this for X months, and if it does not work by then we move on to something different. Indeed, most times we get so stuck on an idea that we just cannot give it up!

The big downside of working for a company: you lose your intellectual freedom. You could not just call your good friend, the expert, and discuss your work or ask him for a reagent. You also lose the flexibility and freedom that academic jobs offer. You could not try that cool experiment you thought of in the middle of the night. You would also lose your ‘individuality’ to some extent, as you are usually a spokesperson for the company.

So if you like being part of a team focused on creating a product that will actually reach real people, industry is perfect. You could still do good science, and of course, not worry about funding.

However, I do think it is a great idea for academic scientists who would like their research to be ‘useful’ to spend some time working in the commercial sector. Although there is a much better industry-academia crosstalk now than a decade ago, I think there are a lot of gaps that are not very evident from the outside, which could be better filled in by academicians.


January 19, 2014

Of tweeps and ivory towers

New year, new calendar, time to think about what conferences I might attend in 2014. As a newly-minted academic, the choice is dizzying. Of course, it would help to have some results to present and my research project is only just getting off the ground, but there are other reasons to go, to see and be seen.


When I was a mere PhD student, there was guidance available on how many conferences I was expected to attend and when to present; as a member of academic staff, there seems to be no guidance at all. Being invited to give a talk remains a far-off possibility.

This is an exciting time in microbiology and particularly antimicrobial resistance research, with more media coverage and funding streams in the past two years than in the preceding 20, so there is a lot to talk about.


So far, events which have definite appeal are as diverse as Genomics 2014, the Oxford Bone Infection Conference and the SMBE satellite meeting on reticulated microbial evolution. This is without even mentioning the alphabet soup which signifies the big players in infection research meetings: ECCMID, ICAAC, ASM, FIS, SGM and more.



Key themes in this year's infection conferences


The problem is obvious; even if none of these events overlap with each other, I couldn't possibly attend them all, not while carrying on the "day job" in research, teaching and clinical microbiology. There's also the not-so-small matter of work-life balance.


Do I need to go to conferences? The world wide web draws talks and posters to my laptop; I have discovered that almost all the talks from the Federation of Infection Societies' (FIS) 2013 are available freely as videos (here) and many proceedings, including the excellent Beatles and Bioinformatics, are posted on YouTube (link); in that case, they were actually streamed live, so I could watch and listen from the comfort of my office in real time, with no train fares or hotel bills to pay and no difficult childcare juggling. A late-comer to the twittersphere, I have discovered its utility in alerting me to new papers and research projects, traditionally reasons for attending meetings.




Not quite ivory: the three towers of Coventry, home to the University of Warwick


If I can get all this without leaving my ivory tower, should I go at all? I find myself asking for a cost-benefit analysis of conference attendance. (I may have spent too much time reading clinical guidelines, where cost-benefit analysis is always implicit and usually explicit). Widely reproduced opinions in favour claim that conference attendance is good for the CV, that networking is beneficial and that attending talks outside one's specific subject area broadens knowledge.


What about the costs? Time, travel, accommodation, environmental impact, childcare, family impact... Beyond the personal implications, it has been suggested that conferences also promote presentation of posters containing poor-quality research, which may never make it to peer-reviewed publication, facilitate branding of particular scientific cadres and promote "group-think", whereby the same opinion leaders appear repeatedly on different platforms and the same ideas circulate (1). "Tribes" may form of researchers who meet repeatedly at conferences and become exclusive clubs, inviting each other to talk at yet more conferences and potentially sit on the same grant committees awarding further funding to scientists with more invited talks as a marker of success. Less mobile researchers, those with significant health issues or family commitments which preclude large quantities of travel, or simply those without the funding to subsidise this kind of activity are disadvantaged by their non-attendance.


Putting a price on each of these costs and benefits is impossible and will, of course, be different for every researcher. This leaves me no closer to deciding which and how many meetings I should attend, and indeed raises the question of whether the whole model of large multi-national medical and scientific conferences is broken and a new way of facilitating scientific discourse should be developed.


1. Ioannidis JPA. Are Medical Conferences Useful? And for Whom? JAMA. 2012;307(12):1257-1258.


Follow me on Twitter @ilovechocagar


This post is also found on my blog "Microbiological Musings" at

http://ilovechocolateagar.blogspot.co.uk/


January 10, 2014

What coverage is needed for “good” assembly of a bacterial genome?

I have been asked this question several times recently, along with "how much data is required for genome assembly, and will more data give a better assembly?"

I looked at these questions in a little more detail. To go about answering them I used some data from a bacterium that had previously been sequenced and closed using shotgun cloning and Sanger sequencing. It has since been re-sequenced using Illumina PE (2 x 250bp). The average coverage was 150x. The genome size of the bacterium is ~2.36 Mbp.

I first randomly subsampled this total read pool to produce sub-samples with an average coverage ranging from 15x to 150x. Each sample was assembled de novo with SPAdes and basic assembly properties were assessed with QUAST.
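
As a rough sketch of the arithmetic behind this kind of subsampling: the fraction of reads to keep is just the target coverage divided by the full coverage, and the number of read pairs follows from coverage x genome size / bases per pair. The function names and default values below are illustrative assumptions based on the figures quoted in this post (150x, ~2.36 Mbp, 2 x 250bp); they are not the script actually used, and real reads are often shorter than 250bp after trimming.

```python
def subsample_fraction(target_cov, full_cov=150.0):
    """Fraction of reads to keep so that average coverage drops
    from full_cov to target_cov (hypothetical helper)."""
    if not 0 < target_cov <= full_cov:
        raise ValueError("target coverage must be in (0, full_cov]")
    return target_cov / full_cov


def reads_needed(target_cov, genome_size_bp=2_360_000, read_len_bp=250, paired=True):
    """Approximate number of reads (read pairs if paired=True) needed for
    target_cov, assuming every read is full length."""
    bases_per_unit = read_len_bp * (2 if paired else 1)
    return round(target_cov * genome_size_bp / bases_per_unit)


for cov in (15, 22.5, 24, 25.5, 27, 28.5, 150):
    print(f"{cov}x: keep {subsample_fraction(cov):.3f} of reads, "
          f"about {reads_needed(cov)} read pairs")
```

The resulting fraction (or read count) is the kind of value that would be handed to a random subsampling tool before assembly.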

Basic assembly properties are shown below

N50 - The contig length such that contigs of that length or longer together contain at least half of the total length of all contigs
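
For readers who prefer code to definitions, N50 can be computed in a few lines. This is a minimal sketch of the definition above rather than QUAST's own implementation, and the function name and example contig lengths are made up for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and their lengths summed
    until the running total reaches at least half of the total assembly
    length; the length of the contig that gets us there is the N50.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs supplied")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Made-up contig lengths (in kb) purely for illustration.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Note that N50 depends only on the lengths of the contigs, so it can go up if contigs are wrongly joined together, which is one reason it is not a measure of quality on its own.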


graph_1

As seen above, once the coverage is above 15x there is a big jump in the N50, from 181 kb to 317 kb.

To further narrow down the particular point at which the N50 increased, I produced samples with coverage of 15, 22.5, 24, 25.5, 27 and 28.5x. Again they were assembled with SPAdes.

The results of this are below and show that as coverage increases so does the N50 value, up to 27x coverage. Above this coverage there is no increase in the N50.

graph_2

So if assessing a “good” assembly purely on N50, which is not a great idea, then coverage of 27x gives the same answer as 150x, and the extra data provide no additional benefit.

Looking at other parameters, such as the number of contigs longer than 1kb, also reveals that above 27x coverage there is no further decrease.

graph3a_andrew_millard

Whilst N50 will give an indication of the size of contigs, it does not inform on the quality of the assembly. To assess quality, QUAST was again used, this time to compare each assembly back against the original reference genome. A range of metrics can be produced; only a few are detailed below, and all of the data below are compared back to the known reference sequence.

First, looking at the Genome Fraction of the "de novo assembly" compared to the reference.

QUAST defines Genome fraction (%) as: the percentage of aligned bases in the reference. A base in the reference is aligned if there is at least one contig with at least one alignment to this base.
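
To illustrate that definition, here is a simplified sketch that computes a genome fraction from a list of alignment intervals on a single reference sequence. It is not QUAST's code: the function name, the 1-based inclusive coordinates and the example intervals are assumptions, and the intervals are assumed to lie within the reference.

```python
def genome_fraction(alignments, reference_length):
    """Percentage of reference bases covered by at least one alignment.

    alignments: iterable of (start, end) reference coordinates,
    1-based and inclusive. Overlapping alignments are counted once.
    """
    if reference_length <= 0:
        raise ValueError("reference length must be positive")
    covered = 0
    current_end = 0  # right-most reference position counted so far
    for start, end in sorted(alignments):
        # Skip the part of this interval that has already been counted.
        start = max(start, current_end + 1)
        if end >= start:
            covered += end - start + 1
            current_end = end
    return 100.0 * covered / reference_length


# Three hypothetical contig alignments on a 100 bp reference.
print(genome_fraction([(1, 50), (40, 80), (90, 100)], 100))
```

Because each reference base is counted once however many contigs align to it, this metric reports how much of the genome was recovered, not how deeply or how contiguously.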

graph4_Andrew_millard

Again above 27x coverage the % was stable and did not increase with coverage – even at the lowest 15x coverage 99.75 % of the genome was still present.

Next, looking at the % of complete genes found in the de novo assembly compared to the reference genome.

graph5_andrew_millard

Above 24x coverage the result again plateaued. The small variation in the % of complete genes is due to 1 gene that is found in some assemblies and not others.

As can be seen below, there are a number of genes that are only partially complete (compared to the reference).

graph_6_andrew_millard

Increasing the fold coverage above 24x will not decrease the number of partial genes that are consistently obtained for an assembly.

For this particular bacterium, with a small genome size (2.36 Mb), there is no advantage in sequencing to a higher depth than 27x coverage, as the number of contigs >1kb in length does not decrease, and the number of complete genes and the % of the genome that is mapped do not continue increasing with greater coverage.

So in answer to the original questions: increasing the amount of sequence will produce a better assembly, but only up to a certain point.

How much sequence is needed will vary with the bacterium that is being sequenced. For this bacterium, 27x gives the same answer as 150x, but this could only be calculated after 150x coverage was obtained. However, the use of simple metrics from programs such as QUAST and subsampling of existing data gives some indication of whether increasing the amount of data will produce a better assembly.






December 16, 2013

Engaging the future – scientists in the making

I was recently invited to get out of the office to attend the Science Fair at Outwood Grange Academy, Leeds by Rebecca Simmonds, who is one of the highly enthusiastic science teachers there. I was really excited about what I was going to see, as two of the students from last year's competition are currently in Guangdong Province in China taking part in the International Teenagers Science Festival (updates can be found here http://www.grange.outwood.com/). The students were divided into three groups depending on the key stage they belonged to. There were over 30 projects, ranging from how different brands of toothpaste affect the number of bacterial colonies formed after brushing, to the effect of different household compounds on UV protection, to the influence of colour on garden bird feeding patterns, to the consequences of the presence of conkers for spiders' activities. I was highly impressed by the students' enthusiasm, the amount of effort they had put into completing their projects and their knowledge of the particular topics they had studied. It was clear that they had had a great time working on their projects, which were clearly supported by an excellent science department with teachers passionate about seeing their students succeed.

After my fellow judges and I had finished our judging, the four of us who had come from industry and academia gave short talks about our careers so far. I think it helped that the week this took place was also antibiotics awareness week and it was all over the press. But I was really blown away by how much interest there was in how antibiotics and bacteria function, and how resistance arises. I think if the children and parents hadn’t been stopped I might have been there all night!

There is much talk about the need to encourage school leavers to enter science careers. Many of the children I spoke to were very enthusiastic about following careers in medicine, forensic science and biology. Maybe by allowing all school children to take part in science fairs in such a hands-on way we will be able to fire their enthusiasm, encouraging them to follow a career that those in it greatly enjoy.

I very much look forward to being invited back to see what other exciting topics the students will investigate.


September 28, 2013

Our Inaugural Symposium storified and glorified!



Most recent comments

  • Hi Shilp, glad that you found it useful . I used seqtk sample. So if I had 100 reads for 90% seqtk s… by Andrew Millard on this entry
  • Dear Andrew, Fantastic post..! very informative.. I would like to know how did you down–sample your … by Shilp Purohit on this entry