NERSC Supercomputers Help Researchers Create Reference Catalog for Rumen Microbiome
Cultivation and sequencing effort targets economically and environmentally relevant microbes
March 19, 2018
By David Gilbert
Using supercomputers at the National Energy Research Scientific Computing Center (NERSC), an international team generated a reference catalog of rumen microbial genomes and isolates cultivated and sequenced from the Hungate1000 collection. The team was led by William (Bill) Kelly, formerly at AgResearch New Zealand’s Grasslands Research Centre, and included scientists at the U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a DOE Office of Science User Facility. One of the largest targeted cultivation and sequencing projects to date, the collection was produced through the coordinated efforts of rumen microbiology researchers worldwide.
At the beginning of the project, reference genomes were available for only 14 bacteria and one methanogen. The Hungate catalog now contains 501 genomes: 410 newly generated in this study, plus 91 already publicly available from other studies. A paper highlighting this work was published today in Nature Biotechnology.
The digestive tracts of ruminant (cud-chewing) animals such as cattle, sheep, and goats convert lignocellulosic plant matter into short-chain fatty acids used for nourishment, and they do so with unparalleled efficiency thanks to the activity of symbiotic microbes in the rumen. Rumen microbes play a vital role in allowing ruminant livestock to break down the food they eat and to produce milk, meat, and wool, which help support the livelihoods and food security of over a billion people worldwide. The process, however, is also the single largest human-influenced source of the greenhouse gas methane (CH4), with these animals releasing approximately 138 million U.S. short tons of CH4 into the atmosphere each year.
Understanding the diversity and function of the rumen microbiome is a critical step toward developing technologies and practices that support efficient global food production from ruminants while mitigating methane emissions. Additionally, there is considerable interest in identifying biotechnologically relevant enzymes for the conversion of plant feedstocks to biofuel and bioproducts.
Scaling Rumen Microbiology Science
“JGI is a world leader in conducting, enabling, and democratizing sequence based research—and one of the few places that does science at this large scale. Beyond the sequence generation, data processing, and big compute resources, we bring significant experience and expertise to help bridge the gap from sequence to biology,” said Rekha Seshadri, JGI computational biologist and co-first author of the paper.
The project was named in honor of the late Robert Hungate (1906-2004), a pioneering microbiologist who invented the widely used technique for cultivating strictly anaerobic bacteria that now bears his name.
The work, noted Kelly, is the culmination of a JGI Community Science Program (CSP) 2012 proposal that originated at a meeting of rumen microbiologists held in New Zealand in February 2011. Former JGI Director Eddy Rubin was one of the attendees at the meeting as the JGI had just published the first rumen metagenome study in Science, and his perspective on the scale of sequencing that was possible encouraged the development of an ambitious project.
Kelly said, “Our work to generate the Hungate genome catalogue provides the link between the classical microbiology that provided the cultivation basis for anaerobic (especially rumen) microbiology and the modern techniques that are independent of microbial cultivation. The combination of DNA and RNA sequence analysis and laboratory experiments with genome sequenced, characterized, cultivated strains can now be used to begin to reveal the intricacies of how the rumen microbial community functions.”
“This project is a great example of how science can progress rapidly if we do things together,” added study co-first author Sinead Leahy, now with the New Zealand Agricultural Greenhouse Gas Research Centre. “The project helps further fill the gap in our knowledge of anaerobic microorganisms as it specifically targets the specialized anaerobic rumen environment and reports genome sequences from strains that have yet to be taxonomically assigned or phenotypically characterized but which are among the most abundant rumen micro-organisms.”
Science Highlights
The Hungate catalog encompasses 75 percent of genus-level taxa reported from the rumen. The researchers were able to assign individual microbes to the major metabolic pathways involved in rumen function. They reported that, in total, the catalog of genomes encodes nearly 33,000 degradative Carbohydrate-Active Enzymes that can break down plant cell walls. Other metabolic highlights and evolutionary vignettes of the rumen microbiome are presented in the manuscript. Among them, the researchers noted an interesting instance of evolution by gene loss involving the universally conserved enolase, the penultimate enzyme in glycolysis, the metabolic pathway that converts glucose to pyruvate. Rumen-specific adaptations such as de novo synthesis of vitamin B12 and potential vertical inheritance of the rumen microbiome are also discussed.
To test the value of the Hungate collection as a resource that underpins metagenomic analysis, 1.4 million coding sequences (CDS) from the reference genomes were searched against ~1.9 billion CDS from over 8,000 varied metagenomic samples, stored in the Integrated Microbial Genomes & Microbiomes (IMG/M) database. The IMG/M system supports annotation, analysis, and distribution of microbial genomes and microbiomes.
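As a rough illustration of how results from such a search might be summarized, the sketch below tallies the share of reference genomes detected in each habitat. It is not the study's pipeline: the function name, the hit format (genome ID paired with sample habitat), the min_hits threshold, and the toy data are all assumptions made for this example, and the detection criteria actually used are described in the paper.

```python
from collections import defaultdict


def detected_fraction_by_habitat(hits, n_genomes, min_hits=1):
    """Return, per habitat, the fraction of reference genomes detected.

    hits      -- iterable of (genome_id, habitat) pairs, one per matching
                 coding sequence (hypothetical format for this sketch)
    n_genomes -- total number of reference genomes searched
    min_hits  -- assumed threshold: a genome counts as detected in a
                 habitat once it has at least this many hits there
    """
    # habitat -> genome_id -> number of hits
    counts = defaultdict(lambda: defaultdict(int))
    for genome_id, habitat in hits:
        counts[habitat][genome_id] += 1

    return {
        habitat: sum(1 for c in per_genome.values() if c >= min_hits) / n_genomes
        for habitat, per_genome in counts.items()
    }


# Toy data only; these are not results from the study.
toy_hits = [
    ("genome_A", "rumen"),
    ("genome_A", "human_gut"),
    ("genome_B", "rumen"),
    ("genome_C", "rumen"),
]
fractions = detected_fraction_by_habitat(toy_hits, n_genomes=3)
```

In this toy input, all three genomes have a rumen hit and one of the three has a human gut hit. A real analysis would also need sequence identity and coverage cutoffs before counting a hit.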
The majority of Hungate genomes were indeed present in available rumen metagenomes. “However, there was significant overlap with the human microbiome – almost a third of the species were detected in human digestive system samples, inadvertently increasing the reference set for the study of the human microbiome as well. IMG is a comprehensive resource of sequence data integrated with environmental metadata (which is key), without which, these observations would not have been made,” said Seshadri. Integrating microbiome data across all habitat types to enable novel correlations and discovery is one of the main pillars of the recently proposed National Microbiome Data Collaborative (NMDC), in which JGI is playing a leadership role.
A Resource for Rumen Microbiology
“This publication marks a significant contribution to rumen microbiology and the genome sequences, coupled with their corresponding cultures, will make the Hungate Collection an outstanding resource for rumen microbiology groups around the world to link microbial genomes to rumen function and shed light on what has been described as the world’s largest commercial fermentation process,” said Kelly, Leahy, and Graeme Attwood of AgResearch in a joint statement.
The Hungate Collection was conceived as a community resource. Access to bacterial cultures can be requested from AgResearch New Zealand. All genomic data and annotations are available through the JGI Integrated Microbial Genomes & Microbiomes (IMG/M) portal. Additionally, all 410 genomes sequenced in the study can be downloaded via a dedicated portal.
The Hungate1000 project was funded by the New Zealand Government in support of the Livestock Research Group of the Global Research Alliance on Agricultural Greenhouse Gases. The genome sequencing and analysis component of the project was supported by the JGI through the Community Science Program, and used resources of the National Energy Research Scientific Computing Center (NERSC), which is supported by the Office of Science of the U.S. Department of Energy.
Reference: Seshadri R et al. Cultivation and sequencing of rumen microbiome members from the Hungate1000 Collection. Nat Biotechnol. 2018 Mar 19. doi:10.1038/nbt.4110
About NERSC and Berkeley Lab
The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high-performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, NERSC serves almost 10,000 scientists at national laboratories and universities researching a wide range of problems in climate, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.