April 20, 2015

Grand opening of our MRC CLIMB microbial bioinformatics facility at Warwick

On Friday 17th April, we had great fun holding an opening event for our £8.4m MRC-funded CLIMB project and facility at Warwick Medical School.

[For those of you who wish to know more about the CLIMB (cloud infrastructure for microbial bioinformatics) project, take a look at our web site: http://www.climb.ac.uk or watch Tom Connor's talk on YouTube: https://www.youtube.com/watch?v=fZhw5FMLUjM.]

You can access a YouTube video of the first half of the event here:

The event kicked off with me providing a brief introduction to the project and stressing the achievements so far:

  • spending millions of pounds on bioinformatics infrastructure within a single financial year (£3.7m on computers across the four participating universities; £0.7m on building work at Warwick and >£1m at Swansea)
  • getting all the procurement, purchase orders, invoices, building etc through our university systems
  • recruiting excellent support staff and academic staff to the project
  • getting the building work finished (an end to all that noise!)

I then introduced our three guest speakers:

  • Stanley Falkow, godfather of the field of bacterial pathogenesis, skyping in from Stanford and immortalised in our glasswork with his quote "Never met a microbe I didn't like"
  • Randal Keynes, great-great-grandson of Charles Darwin
  • Jon Chase, aka OortKuiper, science rapper.

Stan appeared on our huge 95-inch screen like Big Brother in the iconic Apple ad!

Stan

Stan has kindly provided a summary of his words of support for the project and event:

I am glad to add to the chorus of those who celebrate the opening of the CLIMB, the Cloud Infrastructure for Microbial Bioinformatics at Warwick.

As you know, it’s an admirable enterprise that includes a consortium with Universities at Birmingham, Cardiff and Swansea, to permit a public as well as a private Internet resource, available to all in the UK.

How I envy you! You are the generation of scientists who have available to you software, data storage and bioinformatic expertise to be able to track and understand the epidemics of the past, as well as the contemporary realities of the dynamics of infection and transmission of infectious diseases. You have the ability to examine the genome of the offending microbe as well as the host - including us, the human.

I am reminded by the presence of Professor Mark Achtman in this gathering of a time in 1978 when he, as well as Gordon Dougan, who was until recently Group Leader in Pathogen Genomics at the Sanger Centre, were visitors in my laboratory in the University of Washington. We were at that time able to sequence 75 base pairs a day, and analyze the results on a Radio Shack computer with 16 K of RAM using a program I wrote in BASIC. I remember the thrill of my first ATG start codon! How far we have come in the past 37 years!

I can only dream about what you can accomplish in the next decade. Would that I will be here to learn of your accomplishments - of which, I may say, you can only dream and speculate about now! I always say: the good old days are now!

Good luck and God speed in the adventure that awaits you.

Randal then said a few words, linking our efforts to the legacy of Darwin:

Introduction

I’m here today because when Mark Pallen told me about CLIMB, I was fascinated and impressed by the boldness of the plan - managing all these fresh kinds of information with Cloud power and flexibility for all the intriguing investigations you’re now working out how to tackle. Mark suggested I might say some things for Darwin at this gathering, and I felt at once that yes, if Darwin could be with us for this opening today, he’d be fascinated and impressed. No – excited, I realised. He’d see very clearly the opportunities CLIMB presents, and he’d sense the spirit of the whole venture. He’d be eager to hear about every plan and delighted to talk. I thought that if it might mean anything for you today to hear what I can say about this point, I’d be glad to tell you here and now, for Darwin’s sake as well as yours. And why that? Because this venture is one of so many continuations of what he started with The Origin of Species, and I feel that all who join in the great effort in ways like yours are joined with him and each other in it.

Hearing about CLIMB – the technical opportunity, the resources you have to seize it with, and all the lines of research you’re planning to use it for, I think of the chain of scientists whose ideas you are developing – Darwin as founding father and then Robert Koch, Ferdinand Cohn and Carl Woese among others since, each with his step forward. Each of them had some luck. He made it his opportunity and realised its potential. You and all your colleagues in the CLIMB collaboration have yours with CLIMB today.

Cloud Infrastructure for Microbial Bioinformatics. Wonderful words for an onlooker like me to understand, as they fit together so clearly and tightly for your purpose. Let me now touch on some points that I feel link Darwin with you.

The Evolutionary Factor

Mark Pallen took as his text for his Inaugural Lecture last year, Dobzhansky’s insistence that “Nothing in biology makes sense except in the light of evolution.” Looking then more closely at his Department’s special concerns, Mark showed how true that is for microbiology as a specialism. Looking yet closer today at CLIMB, he’s made it clear to me how central Dobzhansky’s point is for so much of microbial bioinformatics. One key point, it seems to me, is the speed of reproduction in microbiology with the extent of variation and selection in the process, and the significance of the changes for other organisms. It’s a truism how difficult we find it to observe evolution of macroorganisms; there are the few well-known examples, but for all other species it’s like ‘grandmother’s footsteps’ with the stillness whenever you look behind you for any movement. But microbial genomics can be a helter-skelter ride. In this area, excitingly, the often bucking process of change is a key concern, a central issue that we need to understand.

The Tree of Life

Darwin wrote in The Origin of Species that relationships between all species of the same kinds “have sometimes been represented by a great tree.” The ‘Tree of Life’ no less. He then wrote carefully, “I believe this simile largely speaks the truth. The green and budding twigs may represent existing species; and those produced during each former year may represent the long succession of extinct species. [In this way] the great Tree of Life … fills with its dead and broken branches the crust of the earth, and covers the surface [above] with its ever-branching and beautiful ramifications.” It’s important to recognize here that the essential point for Darwin when he wrote “I think” in his secret notebook and drew his first branching diagram, and for CLIMB now, was and is surely not the huge trunk of the great tree but quite simply those “ever-branching and beautiful ramifications”; Darwin’s “endless forms, most beautiful and most wonderful, [which] have been, and are being evolved”, as he ended his final sentence of The Origin of Species. “Are being evolved.” With those last three words Darwin placed his final emphasis on the ever-continuing process that is central to what CLIMB will be all about.

Big Data

Four petabytes of data storage and 78 terabytes of total RAM. The need for such remarkable capacity and power follows directly from the understanding of evolution that has stemmed from Darwin’s writings with its randomness and variation, its endless proliferations and the whole global diversity of life in which microorganisms indeed excel over macroorganisms. Hearing your talk of ‘big data’ I think at once of the number of data mountains Darwin had to climb through his working life to gain an adequate understanding of the topics he was having to tackle over the range of life on earth to make sense of the factors involved in their global dimension. Especially, he would have said, the eight years he had to spend dissecting barnacles in order to prove himself to be a competent taxonomist, so that he’d be able to write as he wanted to, on species around the world and through geological time, and gain any serious attention for his views on what he found to explain their relations. For the time he took, we should remember his poor son who when visiting a friend at the age of ten was shown around the house and asked ‘Where does your father do his barnacles?’ From Darwin’s understanding of the endless variety of natural life and the infinite complexity of all the interactions it involves, he would appreciate at once why CLIMB is focussing so sharply on the scale of data storage and processing capacity it’s able to provide.

Value for Medicine

When Mark Pallen first explained the project to me, I saw at once its great interest for pure science, but when he explained about some of the investigations it is to be used for and mentioned the MRC funding, only then did I really take in its significance for medicine. Picking up on its potential for work on hospital infections and Antimicrobial Resistance, I remembered at once a point about Darwin. Not an achievement of his but his reaction to one of another person, how quickly he recognised its value for medical treatment and how strongly he felt about that value.

I have to sketch in some background. When Darwin was working on The Origin of Species, his first daughter, Annie, then ten years old, fell ill, probably with TB. He was devoted to her; he did all he could to save her life, caring for her night and day in her last illness; he was devastated by her loss, and he was deeply shaken by the doctors’ inability to identify, understand and treat her illness. Twenty five years later, in 1877 as Louis Pasteur was making the case for the germ theory of infection, a close scientific friend of Darwin’s, Professor Ferdinand Cohn, then a plant physiologist but later to become the founding figure of microbial taxonomy, sent him a copy of his journal for plant science. The issue contained the first photographs ever published of bacteria. They had been taken by Robert Koch who was to identify the TB bacillus five years later. Dr Koch had come to Cohn with his photographs of his first microscopic preparations of the anthrax bacillus, and had shown him his paper arguing for the first time that these bacilli were the cause of the disease. Professor Cohn recognised at once the great importance of his findings for medicine and the saving of life, and wrote to Darwin that Koch’s photographs showed “the least but also perhaps the mightiest living beings”. Darwin replied to him, “I well remember saying to myself between twenty and thirty years ago, that if ever the origin of any infectious disease could be proved, it would be the greatest triumph to Science; now I rejoice to have seen the triumph.” That in those words was what Koch’s achievement meant to Darwin, the scientist and the father.

Information Management

I was fascinated to see Tom Connor’s explanation on YouTube of the CLIMB project, with his picture of the sequencing iceberg and his breakdown of the budget for different parts of the project. 75% for the expertise in the informatics, a critical need for the whole venture. It is fascinating to see how CLIMB users will be using together with their quantities of genomic data such quantities also of data of other very different kinds from very different sources, clinical, diagnostic, and then also population and epidemiological.

I have no scientific experience but have worked in the public sector on some matters needing careful and effective management of ranges of different kinds of information together, with gaps and inconsistencies in and between different datasets often compounding the difficulties of drawing any sound conclusions. So often, critical needs for information management just weren’t recognised by the managers and weren’t provided for. It seems to me that many people just don’t see these kinds of problems and their consequences, because they feel that information is just information and doesn’t need any managing to achieve completeness, accuracy, consistency, availability - and so meaning. People are often perfectly good with their own data simply because they know it well, but are then casual and careless about other peoples’, and when they use others’ in combination with their own, they don’t see the potholes until they realise they are stuck in one. With the range of data you’ll be using for all your range of aims, all the care you’re taking with the discipline of informatics will be invaluable for success.

Suggestions from Darwin

With all that lies ahead for you all in your work with CLIMB on today’s scientific and medical challenges, I’d like to offer from Darwin’s experience two suggestions on how to move forward. The first is an early comment of his when as a young man he was first glimpsing the power of the ideas he was fitting together on ‘descent with variation’ and ‘natural selection’, and the second is the last comment he made on research like yours before he died.

For the first suggestion, shortly after Darwin drew his first iconic branching diagram in his secret notebook, he spotted the extraordinary implication for all humans and animals and went on, “If we choose to let conjecture run wild, then animals, our fellow brethren in pain, disease, death and suffering, our slaves in the most laborious works, our companions in our amusements – they may partake from our origin in one common ancestor, we may be all netted together.” Just notice how he started that comment. “If we choose to let conjecture run wild …” Yes, with the fresh information and ideas you’ll be developing with CLIMB, choose to do just that, dare to! Bold conjectures may succeed powerfully.

Darwin’s last comment on research of this kind appeared in a preface he wrote for a work by a brilliant young friend on plants’ remarkable adaptations for cross-fertilisation by their pollinators. He spun out a series of ideas he’d found in the book for further investigations he’d love to pursue, and then, knowing privately that he was dying and wouldn’t be able to take any of them up, he continued – “But it would be superfluous to make any further suggestions. These will occur in abundance to any young and ardent observer who will study this work and then observe for himself, giving full play to his imagination, but rigidly checking it by testing each notion experimentally. If he will act in this manner, he will, if I may judge by my own experience, receive … much pleasure from his work.”

CLIMB now offers a wealth of fresh opportunities for research just like those that Darwin could then see. Opportunities for “young and ardent observers”, if they will “observe for themselves, give full play to their imagination, but rigidly checking it by testing each notion experimentally”. And we here today can add “analytically” with all CLIMB’s processing powers.

Then Randal symbolically opened the champagne

Randal and champagne

and we had a brief interlude before Jon Chase began his science rap session.

jc

You can access the video of Jon's performance here:


A full set of photos of the event can be accessed here: https://www.dropbox.com/sh/ky3ck6wsij4mx4b/AABxsWEoVlxklGQ2NhjegzgVa?dl=0

And to close this blog post, how about this classic pose of me and Jon! Cool, no?

MP and JC

And a big thank you to all who worked to make this such a special event!!


March 13, 2015

The story behind the paper: Sedimentary DNA from a submerged site reveals wheat in the British Isles

Writing about web page http://www.sciencemag.org/content/347/6225/998

Late last month, I was proud to be joint last author on a paper in Science on the presence of wheat in the British Isles 8000 years ago. But how does a medical microbiologist come to be involved in a study on the intricacies of the Neolithic transition?

Well, like many of life’s greatest ventures, it all began in a bar…

I have to admit to a weakness for rounding the week off with a Friday evening trip to the bar. This started when I worked at Barts in the 1980s and 1990s, where the Robin Brook Centre bar hosted many a lively conversation (and acted as a link to various melodramas, including an alleged murder, hostage taking and a police shoot-out: but that’s another story).

When I arrived at the University of Birmingham in 2001, I was delighted to discover the Bratby Bar, nestled within the university’s Staff House. During more than a decade of visits, I had the chance to chat to all sorts of people from across the University, from Pro-Vice-Chancellors to post-docs. Fortuitously, John Heath (formerly Head of Biosciences, latterly Birmingham’s PVC for Estates) introduced me to Vince Gaffney, a garrulous landscape archaeologist from Geordieland (below).

Vince Gaffney

Having recently set up a next-generation sequencing service and also having picked up on the excitement of ancient DNA research, at intervals I suggested to Vince that he should let us have some archaeological material to play with, to see if we could get any sequences out of it. Imagining we could tread in the footsteps of Schliemann or Carter, I had in mind something glamorous like a mummified hand or a skeleton from a ritual burial. Instead, we ended up with some mud! But mud of a highly precious and productive sort.

Vince was interested in understanding how the Neolithic transition (the spread of farming after the domestication of plants and animals) arrived in northwest Europe. The arrival of farming in this part of the world coincided with rising sea levels following the end of the last Ice Age. Vince had a track record in studying the landscapes that were inundated during this time and he was convinced the earliest clues to the arrival of the Neolithic in this part of the world would be found in these now-submerged sites.

Vince pointed me in the direction of some pioneering studies on sedimentary ancient DNA, which had established that DNA from macroscopic plants and animals could be detected in sediments even in the absence of macrofossils and could be used to reconstruct past environments. Two studies in particular stood out: one on the Viking settlements in Greenland and the other on the detection of sheep and moa DNA from outside a cave in New Zealand. It struck me that this was an exciting emerging field, fertile with opportunity.

Vince suggested that we try to detect signs of Neolithisation by searching submerged sediments for DNA from domesticated species that had no natural relatives in North Western Europe. That ruled out cows (wild relative: the aurochs) and pigs (related to wild boar), but made sheep and goats an attractive target. I pointed out to Vince that although we had the wherewithal to do the high-throughput sequencing and bioinformatics, it would be a rather fraught process trying to devise and implement protocols for target-specific amplification of ancient DNA. Instead, buoyed up by recent success with metagenomics on human faecal samples, I suggested that we try simple shotgun metagenomics—in other words we just extract DNA from the sediment cores and sequence it directly without any attempt at target-specific amplification or capture.

And then a period of turbulence descended on our academic lives…

I was headhunted and recruited to a new position at the University of Warwick in April 2013, while Vince was preparing to leave the University of Birmingham and eventually ended up at the University of Bradford. This could have signalled the end of the proposed research, but Vince and I were determined to continue with the work.

In fact, as luck would have it, my move to Warwick breathed new life into the project, as I hooked up with Robin Allaby from Warwick’s School of Life Sciences. Robin, seen here in the guise of a modern-day Jesus of the barley field, not only had a track record in the evolution of domesticated species, particularly plants, but had also established a dedicated ancient DNA laboratory at Warwick, ideal for performing DNA extractions from sediment cores.

Robin Allaby

I quickly persuaded Robin of the merits of the project and, as I was preoccupied with establishing a new Division of Microbiology and Infection, passed over to him day-to-day supervision of the work. Fortunately, Robin was able to recruit his recently graduated PhD student, Oliver Smith, to the study. Oliver was an ideal candidate in having experience with ancient DNA studies, while also being between projects. Funding for the work came from my start-up package from Warwick Medical School, which paid for a sequencing instrument (an Illumina MiSeq), sequencing reagents and a salary for Oliver for nine to ten months.

By the middle of 2013, Vince had tracked down the perfect samples for the project: some 8000-year-old submerged sediment cores that had been collected from the Solent by the maritime archaeologist Gary Momber. Oliver extracted DNA from four samples of sediment in the ancient DNA lab and then sequenced them on our MiSeq. He and Robin then analysed the metagenomic sequences. Robin soon recognised that naïve use of existing metagenomics analysis pipelines was likely to turn up spurious results because of biases in what was represented in the databases (see Ed Yong's recent blog post on the “discovery” of platypus DNA in Virginia and plague on the New York subway), so he devised an improved method that avoided the problem.

Contrary to our initial hopes, Robin and Oliver did not discover any sheep or goat DNA. Instead, they discovered sequences from wheat, a domesticated plant that originated in the Middle East, with no close wild relatives in Northern Europe. This represented a triumph for metagenomics in ancient DNA research, confirming two advantages of this approach over target-specific assays:

  1. It is open-ended, not just targeting what you expect to find, but also revealing the unsuspected.
  2. It is probably more sensitive than target-based amplification in garnering relevant information from billions of base pairs of unamplified DNA rather than amplified copies of just a few hundred base pairs of a sequence barcode.
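The first of these points can be illustrated with a toy sketch. Everything in it is made up for illustration: the reference sequences, the reads, the k-mer length and the function names are hypothetical, and this is not the analysis method used in the paper. It simply contrasts asking "is there any sheep?" with tallying whatever the reads turn out to match.

```python
from collections import Counter

K = 8  # k-mer length, chosen arbitrarily for this toy example

# Made-up reference sequences (NOT real genomic sequence)
references = {
    "wheat": "ATGGCGTACGTTAGCCTAGGATCCGATTACG",
    "sheep": "TTGACCGGTAAGCTTCCAGGTTTAACCGGAT",
    "oak":   "GGCATTCGAAGTCCATGCAAGGCTTAGACCT",
}

def kmers(seq, k=K):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(refs):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, seq in refs.items():
        for km in kmers(seq):
            index.setdefault(km, set()).add(taxon)
    return index

def classify(read, index):
    """Assign a read to the taxon with most taxon-specific k-mer hits."""
    votes = Counter()
    for km in kmers(read):
        taxa = index.get(km, ())
        if len(taxa) == 1:  # ignore k-mers shared between taxa
            votes[next(iter(taxa))] += 1
    return votes.most_common(1)[0][0] if votes else "unassigned"

index = build_index(references)

# Toy "shotgun" reads: two wheat fragments, one oak fragment, one unknown
reads = [
    references["wheat"][3:23],
    references["wheat"][8:28],
    references["oak"][5:25],
    "A" * 20,
]

# Open-ended view: tally everything the reads match
shotgun_tally = Counter(classify(r, index) for r in reads)

# Targeted view: only count reads assigned to the expected species
sheep_hits = sum(1 for r in reads if classify(r, index) == "sheep")

print(shotgun_tally)
print("sheep hits:", sheep_hits)
```

In this toy case the targeted question comes back empty, while the open-ended tally shows the wheat reads. The sketch also hints at the database problem mentioned earlier: a read can only be assigned to something that is in the reference set, so what the references contain shapes what you appear to find.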

After that, Robin played a key role in co-ordinating the writing and submission of a manuscript, carefully steering our paper through the reviewing and editorial process. And so, finally, we ended up with every academic’s dream-come-true—a paper in Science magazine!

Of course, my account of things here is heavily biased towards the role of sequencing and bioinformatics in this project. It is also important to recognise the key role played by our archaeological collaborators in framing the right questions, gathering the right samples, performing the palaeo-environmental analyses and providing the relevant contextual interpretation of the findings.

And this success brings a new challenge: what on earth is Warwick Medical School going to do with this high-impact paper in Science for REF2020, as I cannot see it flying with the clinical medicine Unit of Assessment! But we have five years to work on that problem!

Let me close by raising a figurative glass to toast the role of Birmingham’s Staff House Bar in all this! A note to all PVCs for Estates: shouldn’t all universities be investing in similar drinking establishments to catalyse new projects and facilitate collegiality? And a note to the relevant promotion panel in Warwick: shouldn’t it soon be Professor Robin Allaby? I’ll drink to both points!

Pallen and Allaby

Robin and I celebrating success at the top of the Shard.

The paper: http://www.sciencemag.org/content/347/6225/998

Commentary on the Paper in Science: http://www.sciencemag.org/content/347/6225/945.summary

Press release: http://www2.warwick.ac.uk/newsandevents/pressreleases/dna_evidence_shows/


September 23, 2014

Sequence the sputum: using metagenomics to diagnose tuberculosis

Writing about web page https://peerj.com/articles/585/

Laboratory diagnosis of tuberculosis (TB) using conventional approaches is a long drawn-out process, which takes weeks or months—plus, relying on laboratory culture means using techniques that date back to the 1880s!

In a report published today in the peer-reviewed journal PeerJ, we describe a new approach to the diagnosis of TB that relies on metagenomics—that is direct sequencing of DNA extracted from sputum—to detect and characterize the bacteria that cause TB without the need for time-consuming culture in the laboratory. Using the latest high-throughput sequencing technologies and some smart bioinformatics, we can now obtain sequences from the bacteria that cause TB in just a few days straight from clinical samples and gain insights into their genome sequences and the lineages they belong to, all without having to culture cells or capture or amplify DNA.

In this study, first-year PhD student Emma Doughty (https://twitter.com/EmmaDoughty6) and bioinformatician Dr Martin Sergeant, both working at Warwick Medical School, have worked with African scientists Dr Martin Antonio and Dr Ifedayo Adetifa, working at the MRC Unit in The Gambia, to develop and exploit novel sequencing and analytic approaches. They detected sequences from the TB bacteria in all eight sputum samples they investigated and were able to assign the bacteria to a known lineage in seven of the samples. Two samples were found to contain sequences from Mycobacterium africanum, a variety of the TB bacterium that is particular to West Africa.

This is part of a connected programme of research in the Pallen group, where we have been using metagenomics to detect bacterial pathogens in contemporary and historical human material. Last year, we used metagenomics to obtain an outbreak strain genome from stool samples from an E. coli outbreak and to recover TB genomes from ~200-year-old Hungarian mummies. Earlier this year, we recovered the genome of Brucella melitensis, which causes an infection called brucellosis in livestock and humans, from a 700-year-old skeleton from Sardinia, Italy.

We now aim to work on a larger number of sputum samples, perhaps looking at a hundred consecutive samples in the fullness of time. But, before then, we need to spend a bit more time optimising our DNA extraction protocols. We were pleasantly surprised that the protocol we used worked “out of the box”, but we are confident that we can improve things so we get fewer human DNA sequences and more mycobacterial sequences from each sample. If we can increase coverage of the TB genomes, we may soon be able to detect mutations associated with drug-resistance directly from the sputum.
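To put rough numbers on why the balance of human to mycobacterial reads matters for coverage, here is a back-of-envelope sketch. The read count, read length and mycobacterial fractions are made-up illustrative values, not figures from the study, and the genome size of about 4.4 million base pairs for M. tuberculosis is an approximation.

```python
# Approximate M. tuberculosis genome size in base pairs (assumption: ~4.4 Mb)
MTB_GENOME_SIZE = 4_400_000

def expected_coverage(total_reads, target_fraction, read_length,
                      genome_size=MTB_GENOME_SIZE):
    """Mean depth = (target reads x read length) / genome size."""
    target_bases = total_reads * target_fraction * read_length
    return target_bases / genome_size

# Hypothetical run: 10 million reads of 150 bp each
low = expected_coverage(10_000_000, 0.0001, 150)   # 0.01% mycobacterial reads
high = expected_coverage(10_000_000, 0.01, 150)    # 1% mycobacterial reads

print(f"{low:.2f}x versus {high:.2f}x")
```

With these invented numbers, raising the mycobacterial share of reads a hundredfold takes the expected depth from a few hundredths of a genome to a little over three genomes' worth, without sequencing any more DNA.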

The final goal, shimmering on the horizon, is that we might one day be able to extract information from all the macromolecules in a sample (DNA, RNA, proteins) so that we get a read-out of what pathogens are there, what virulence or resistance genes are being expressed, what host responses are switched on and also maybe detect cancerous or pre-cancerous changes in the patient’s genome. This is probably going to rely on a new kind of approach: nanopore sequencing—to learn more about this, watch the recent Bioinformatics and Balti session on YouTube. The future is looking very exciting!

PS: we have been very impressed with the service offered by PeerJ, with just two weeks from submission to acceptance!

Professor Mark Pallen, Professor of Microbial Genomics and Head of the Microbiology and Infection Unit,

Warwick Medical School

http://www2.warwick.ac.uk/fac/med/research/microinfect/staff/pallen/
http://twitter.com/mjpallen
https://www.youtube.com/user/pallenm
http://scholar.google.co.uk/citations?hl=en&user=gg8yViYAAAAJ


July 16, 2014

Recovery of medieval Brucella genome by metagenomics

Writing about web page http://mbio.asm.org/content/5/4/e01337-14

Diagnosing a 700-Year-Old Infection

Last summer, Warwick Professor of microbial genomics Mark Pallen and colleagues described recovering tuberculosis genomes from the lung tissue of a 215-year-old mummy from Hungary in the New England Journal of Medicine. Soon afterwards, news of his interest in metagenomic analyses on historical samples spread, and materials started to flow in.

Italian anthropologist Raffaella Bianucci asked Pallen if he would look for pathogens in archaeological samples from Belgium and Sardinia, an island off the coast of Italy, and he agreed. The relationship led to recovering a genome of the bacterium Brucella melitensis from a 700-year-old skeleton found in the ruins of a Medieval Italian village.

Reporting this week in mBio®, the authors describe using a technique called shotgun metagenomics to sequence DNA from a calcified nodule in the pelvic region of a middle-aged male skeleton excavated from the Sardinian settlement of Geridu, thought to have been abandoned in the late 14th century. Shotgun metagenomics allows scientists to sequence DNA without looking for a specific target.

Brucella pics
Skeleton and calcium deposits -- courtesy Mark Pallen, Warwick Medical School

From this sample, the researchers recovered the genome of Brucella melitensis, which causes an infection called brucellosis in livestock and humans. In humans, brucellosis is usually acquired by ingesting unpasteurized dairy products or from direct contact with infected animals. Symptoms include fevers, arthritis and swelling of the heart and liver. The disease is still found in the Mediterranean region.

“Normally when you think of calcified material in human or animal remains you think about tuberculosis, because that’s the most common infection that leads to calcification,” Pallen says. “We were a bit surprised to get Brucella instead.”

The skeleton contained 32 hardened nodules the size of a penny in the pelvic area, though Pallen says it’s unclear if they originated in the pelvis, or higher up in the chest or other body part. The team took care to sample the interior of a nodule, to eliminate the risk of contamination from soil.

In additional experiments, the research team showed that the DNA fragments extracted had the appearance of aged DNA – they were shorter than contemporary strands, with only 100 base pairs, and had characteristic G-A or C-T mutations at the ends. They also found that the medieval Brucella strain, which they called Geridu-1, was closely related to a recent Brucella strain called Ether, identified in Italy in 1961, and two other Italian strains identified in 2006 and 2007. They confirmed their findings by comparing the distribution of genetic insertions and deletions located in Geridu-1 with those found in other Brucella strains.
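The two signs of aged DNA described above, short fragments and C-to-T changes concentrated at the read ends, can be sketched in a few lines. This is a minimal illustration only: the aligned reference/read pairs are invented, the function names are hypothetical, and it is not the pipeline the team used.

```python
from collections import Counter

def mean_fragment_length(alignments):
    """Average read length across (reference, read) pairs."""
    return sum(len(read) for _, read in alignments) / len(alignments)

def c_to_t_profile(alignments, positions=3):
    """For each of the first `positions` bases from the 5' end, return the
    fraction of reference-C sites that appear as T in the read."""
    c_sites = Counter()
    c_to_t = Counter()
    for ref, read in alignments:
        for i, (r, q) in enumerate(zip(ref[:positions], read[:positions])):
            if r == "C":
                c_sites[i] += 1
                if q == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / c_sites[i] if c_sites[i] else 0.0
            for i in range(positions)]

# Invented gap-free alignments: (reference segment, read)
alignments = [
    ("CCGTAACGT", "TCGTAACGT"),  # C->T at the first base
    ("CAGTCCGTA", "TAGTCCGTA"),  # C->T at the first base
    ("CGTACGTAC", "CGTACGTAC"),  # no change
    ("ACGTCAGTC", "ACGTCAGTC"),  # no change
]

profile = c_to_t_profile(alignments)
avg_len = mean_fragment_length(alignments)
print("C->T fraction by 5' position:", profile)
print("mean fragment length:", avg_len)
```

A profile that is high at the first position and falls away further into the read is the kind of pattern that points to old, damaged DNA rather than modern contamination. The matching G-to-A signal at the other end of the read could be tallied the same way by scanning from the 3' end.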

The study “confirms that whole-genome sequences from bacterial pathogens can be recovered from human remains by metagenomics hundreds or even thousands of years postmortem,” Pallen says.

Brucella
Brucella melitensis -- credit: CDC

Pallen’s team is now testing shotgun metagenomics on a range of additional samples, including historical material from Hungarian mummies; Egyptian mummies; a Korean mummy from the 16th or 17th century; and lung tissue from a French queen from the Merovingian dynasty, which ruled France from the 5th to 8th centuries; as well as contemporary sputum samples from the Gambia in Africa.

“Metagenomics stands ready to document past and present infections, shedding light on the emergence, evolution and spread of microbial pathogens,” Pallen says. “We’re cranking through all of these samples and we’re hopeful that we’re going to find new things.”

-- Karen Blum, science journalist writing for mBiosphere



Original blog posting here: http://mbioblog.asm.org/mbiosphere/2014/07/diagnosing-a-700-year-old-infection.html




April 15, 2014

Videos and photos from the Pallen Inaugural

Follow-up to Nothing in Microbiology makes Sense except in the Light of Evolution from The Microbial Underground

Live streamed version:




Slidecast version (better sound quality and no need to look at my ugly mug!)


Photos from the day here:

https://www.dropbox.com/sh/ao2xx91qrjl8rtu/2FijtEacI9


Nothing in Microbiology makes Sense except in the Light of Evolution

Writing about web page https://storify.com/mjpallen/palleninaugural

Here is the online companion to my Inaugural Lecture.



April 14, 2014

False positives complicate ancient pathogen identifications, but only if you are naive and arrogant

Writing about web page http://www.biomedcentral.com/1756-0500/7/111

I came across this piece published in BMC Research Notes a few weeks ago, but have only just found time to comment on it: http://www.biomedcentral.com/1756-0500/7/111

  • False positives complicate ancient pathogen identifications using high-throughput shotgun sequencing BMC Research Notes 2014, 7:111 doi:10.1186/1756-0500-7-111 Michael G Campana (mcampana63@gmail.com) Nelly Robles García (nellym_robles@yahoo.com.mx) Frank J Rühli (frank.ruhli@anatom.uzh.ch) Noreen Tuross (tuross@fas.harvard.edu)

I cannot say that I am too happy with the style of the comments therein on our recent publication of metagenomic recovery of a TB genome from mummified remains (http://www.nejm.org/doi/full/10.1056/NEJMc1302295):

Additionally, a recent study by Chan and colleagues [54] claiming the identification of multiple strains of pathogenic tuberculosis (Mycobacterium tuberculosis) through non-targeted metagenomic sequencing has demonstrated insufficient analytical rigor to support their conclusions. The authors aligned their sequences against a single strain of pathogenic tuberculosis, but did not account for misalignments or environmental contamination with ubiquitous soil mycobacteria. Chan and colleagues’ data merit reanalysis with appropriate environmental controls. We recommend that the authors of these three studies demonstrate the veracity of their findings using a targeted capture approach and further bioinformatic analysis.

I guess working at Harvard makes people prone to academic arrogance! Perhaps there is also a whiff of sour grapes: they couldn't find any pathogens in their samples by metagenomics so we can't have done so either! But dealing with the substance of the comments is easy enough. And ironically, I agree entirely with their earlier comments that these two papers are highly suspect:

http://www.ncbi.nlm.nih.gov/pubmed/23553074,21765907

(NB they are misreferenced in this paper).

OK, so let's take their points on our study one by one...

  • The authors aligned their sequences against a single strain of pathogenic tuberculosis, but did not account for misalignments or environmental contamination with ubiquitous soil mycobacteria.
  • Mycobacterium tuberculosis is a genetically monomorphic species, so there is not much to be gained by aligning against multiple strain genomes. But we did also compare the SNP profiles of our genomes with the recent close relative 7199/99. In the standard filtering that we employed, SNPs with low/high coverage and low mapping scores were removed, thus avoiding problems with repetitive DNA. The fact that the majority of the mixed SNPs matched those of 7199/99 and H37Rv confirms that they are real. Plus, we did discuss the presence of environmental Actinobacteria in the metagenome in the Supplementary Material, where we report the presence of a Nocardia sp at around 200X coverage and of a relative of Thermobifida fusca at around 10X coverage. We binned contigs according to Z score and coverage to avoid mixing up reads from different species. And we obtained deep and even coverage of the M. tuberculosis genome, which cannot be accounted for by misinterpretation of matches to environmental species. We have seen such spurious matches in some analyses, but they appear only when a low-stringency approach is applied to mapping and are obvious because they show spikey coverage limited to conserved regions (e.g. rRNA genes) rather than across the whole genome.
  • Chan and colleagues’ data merit reanalysis with appropriate environmental controls.
  • And what might these controls be? We have analysed a piece of lung tissue from mummified remains from a casket rather than the soil. As detailed in previous papers (http://www.ncbi.nlm.nih.gov/pubmed/?term=12576588+12541332+18399990), rigorous efforts were taken to avoid contamination during sampling and storage. We have never grown M. tuberculosis or sequenced TB genomes in the lab. Where else could the M. tuberculosis DNA have come from other than the sampled individual?
  • We recommend that the authors of these three studies demonstrate the veracity of their findings using a targeted capture approach and further bioinformatic analysis.
  • There is some faulty logic here. We are indeed contemplating using a capture-based approach to increase the sensitivity of our analyses, but this will do nothing for the specificity of the approach, since any contaminating sequences which map to the pathogenic reference strains in silico are likely to be captured in vitro anyway because of their similarity to the bait. The answer is instead to increase the stringency of mapping and look for a consilience of results from multiple sources of evidence (e.g. evenness of coverage, SNPs that allow an assignment within an established clade), which we have done.
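
To make the "evenness of coverage" argument above concrete, here is a minimal Python sketch of the kind of check one could run on per-base read depths after mapping. It is an illustration only, not the pipeline used in Chan et al: the function names, the toy depth values and the thresholds (min_breadth, max_cv) are all hypothetical, and real analyses would work from depths exported by a mapping tool across the whole genome.

```python
from statistics import mean, pstdev


def coverage_summary(depths, min_depth=1):
    """Summarise per-base read depth across a reference genome.

    depths: one integer per reference position (read depth at that base).
    Returns mean depth, breadth (fraction of positions covered at least
    min_depth times) and the coefficient of variation of depth.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    avg = mean(depths)
    breadth = sum(1 for d in depths if d >= min_depth) / len(depths)
    # Coefficient of variation: spread of depth relative to its mean.
    cv = pstdev(depths) / avg if avg > 0 else float("inf")
    return {"mean_depth": avg, "breadth": breadth, "cv": cv}


def looks_spiky(depths, min_breadth=0.9, max_cv=1.0):
    """Flag coverage that is confined to a few regions.

    The thresholds are illustrative assumptions, not published cut-offs.
    """
    summary = coverage_summary(depths)
    return summary["breadth"] < min_breadth or summary["cv"] > max_cv


# Toy examples: depth spread across every position versus a single spike,
# as might be seen when reads from an environmental species hit one
# conserved region (e.g. an rRNA gene) of the reference.
even = [18, 20, 22, 19, 21, 20, 20, 19, 21, 20]
spiky = [0, 0, 0, 200, 0, 0, 0, 0, 0, 0]

print(coverage_summary(even))
print(looks_spiky(even), looks_spiky(spiky))
```

Both toy profiles have the same mean depth, which is why mean depth on its own says little: it is the breadth and the variation in depth that separate a genuine genome-wide signal from spurious matches to conserved regions.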

We are continuing to perform metagenomics on mummified material from Vác and on other historical samples and will be publishing additional studies in due course. This is an exciting area of research and one does have to be careful in interpretation, but our findings stand firm. Anyone who wants to repeat the analyses we reported in Chan et al is welcome to do so. The reads are available here: http://www.ncbi.nlm.nih.gov/sra?LinkName=pubmed_sra&from_uid=23863071

But I am afraid I agree with Campana et al when they criticise the other two papers: Thèves et al because, inter alia, you cannot tell Shigella from E. coli by 16S, and Khairat et al because no sequence data is available in the public domain. Caveat lector! But enjoy the excitement of progress in this field (see also http://www.ncbi.nlm.nih.gov/pubmed/?term=24708363+23765279)


February 19, 2014

The Origin of Research Projects, with hat tip to Darwin

While mulling over the process of submitting research proposals, I came up with this:

"More individuals are born than can possibly survive. A grain in the balance will determine which individual shall live and which shall die,—which variety or species shall increase in number, and which shall decrease, or finally become extinct. As the individuals of the same species come in all respects into the closest competition with each other, the struggle will generally be most severe between them; it will be almost equally severe between the varieties of the same species, and next in severity between the species of the same genus. But the struggle will often be very severe between beings most remote in the scale of nature. The slightest advantage in one being, at any age or during any season, over those with which it comes into competition, or better adaptation in however slight a degree to the surrounding physical conditions, will turn the balance.”

Origin of Species, 1859, Charles Darwin

"More proposals are submitted than can possibly be funded. A grain in the balance will determine which proposal shall be funded and which shall die,—which research group or institution shall increase in number, and which shall decrease, or finally become extinct. As the individuals in the same research area come in all respects into the closest competition with each other, the struggle will generally be most severe between them; it will be almost equally severe between researchers in the same sub-discipline, and next in severity between sub-disciplines of the same discipline. But the struggle will often be very severe between proposals most remote in the type of research. The slightest advantage in one proposal, at any stage or during any meeting, over those with which it comes into competition, or better adaptation in however slight a degree to the surrounding political conditions, will turn the balance.”

Origin of Research Projects, 2014, Mark Pallen


February 07, 2014

Bugs that may not only kill bugs

Unlike most normal well-adjusted folk, my childhood obsession with "bugs" never really went away. Now I find myself still playing with insects for what on the surface appear to be very different reasons, but underneath are still the same…basic curiosity. This soon evolved into an interest in what diseases (another type of “bug”) they can suffer from. The more research I did into this the more I came to understand that they are not so different from humans. They have very similar immune systems and can suffer very similar diseases. In fact it wasn’t long before I began to question if we could catch the same diseases as them?


I am not talking here about insects acting as vectors for human diseases, I mean can bacterial pathogens cause equivalent diseases in both insects and people?

I am particularly interested in a common insect pathogen, which goes by the name of Photorhabdus. This bacterium, which is closely related to plague bacteria (Yersinia pestis), is carried around in the soil by a symbiont nematode worm called Heterorhabditis. This worm burrows into insects and then regurgitates the bacteria, which then rapidly kill the insect. The bacteria produce a large number of drug-like molecules, which prevent any other organisms from infecting the insect, and also sacrifice themselves as a food source for the reproducing worms. The Photorhabdus can control the development of the nematode (somehow) and re-associate with it before they leave the cadaver in search of new insect prey.

This fascinating life cycle provides many interesting aspects to study, such as symbiosis, drug-discovery, immune evasion and toxin biology. Not to mention they glow as the only terrestrial bioluminescent bacterium. Who could not be interested in a glowing disease?


In fact they are so good at killing insects that humans have used them for decades as a biological pest control agent to protect our orchards, greenhouses and golf courses. What has particularly attracted my attention however is that certain strains of this ubiquitous insect pathogen have been found causing a very unpleasant disease in humans!


The Photorhabdus genus contains three distinct species. Two of these, P. luminescens and P. temperata, can only infect insects, while the third, P. asymbiotica, can also infect people. Genetically there is very little difference between these three species, suggesting that the “jump” into humans required very little genetic change.


Our recent studies into how P. asymbiotica can pull off the trick of causing disease in insects and humans have given us some surprising results. The main trick appears to be that they have evolved the ability to tolerate growth at 37°C, which kills the insect-restricted strains.

However, at 37°C P. asymbiotica loses the ability to use many of the nutrients that it can otherwise use at the lower, insect host temperature. We may naively assume that this would make it a “weaker” pathogen. However there are several examples of other bacterial pathogens of humans that appear to have undergone significant “loss of functions” when compared to their close relatives. On the other hand Photorhabdus appears to up-regulate many of the same toxins it would use against insect hosts when exposed to the human body temperature of 37°C. Same tools, different job?

As pathogens of insects far outnumber those of humans, they provide a massive reservoir of potential future human disease-causing agents. How minor the genetic changes have been that have allowed P. asymbiotica to develop this “split personality” will determine how seriously we should take this threat.


January 31, 2014

The ‘Other’ side

I have just recently moved back to the UK – from beautiful, warm Italy. Well, personally it was a much bigger move: from Industry back to Academia.

Several people have been curious about my experience (while some just think I am plain crazy to give up a Company position), so I thought I would share some of the things I learnt from the ‘other’ side in this blog!

When I first started, I still remember being amazed by the amount of careful planning and effort that goes into getting something from a research lab to a product. The attention to detail, the endless number of tests and approvals, and the complexity of the entire process are overwhelming. The large teams of experts, all believing in and working towards the same goal, are really quite something.

Setting objectives and having real deadlines were all foreign to me, but I did not really dislike this part. In fact I felt that these actually make you use your time more carefully. Drawing out personal objectives and reviewing them regularly is actually quite productive. Stopping projects that are going nowhere is a sad but good thing, and the sooner the better.

I think all these ‘company’ routines could actually help an academic, as we quite often tend to lose track of things! I think it would certainly not hurt to say – let's try this for X months, if it does not work by then we move on to something different. Indeed, most times we get so stuck on an idea that we just cannot give it up!

The big downside of working for a company: you lose your intellectual freedom. You could not just call your good friend, the expert, and discuss your work or ask him for a reagent. You also lose the flexibility and freedom that academic jobs offer. You could not try that cool experiment you thought of in the middle of the night. You would also lose your ‘individuality’ to some extent, as you are usually a spokesperson for the company.

So if you like being part of a team focused on creating a product that will actually reach real people, industry is perfect. You could still do good science, and of course, not worry about funding.

However, I do think it is a great idea for academic scientists who would like their research to be ‘useful’ to spend some time working in the commercial sector. Although there is a much better industry-academia crosstalk now than a decade ago, I think there are a lot of gaps that are not very evident from the outside, which could be better filled in by academicians.

