Durban University of Technology and J. Craig Venter Institute
INTRODUCTION TO MICROBIOME STUDIES AND MICROBIOME DATA ANALYSIS
Durban University of Technology, Durban, South Africa
April 21 & 22, 2016
Programme
Durban University of Technology
National Institute of Allergy and Infectious Diseases (NIAID)
J. Craig Venter Institute (JCVI)
Disclaimer: The views expressed in written conference materials or publications and by speakers and moderators at HHS‐sponsored conferences do not necessarily reflect the official policies of the Department of Health and Human Services (HHS), nor does mention of trade names, commercial practices, or organizations imply endorsement by the U.S. Government.
Organizations
About Durban University of Technology
With approximately 23 000 students, the Durban University of Technology (DUT) is the first choice for higher education in KwaZulu-Natal (KZN). It is located in the beautiful cities of Durban and Pietermaritzburg (PMB). As a University of Technology, it prioritises the quality of teaching and learning by ensuring that its academic staff hold the highest possible qualifications.
The Durban University of Technology is the result of the merger, in April 2002, of two prestigious technikons, ML Sultan and Technikon Natal. It was first named the Durban Institute of Technology and later became the Durban University of Technology, in line with the rest of the universities of technology.
DUT, a member of the International Association of Universities, is a multi-campus university of technology at the cutting edge of higher education, technological training and research. The university aspires to be a “preferred university for developing leadership in technology and productive citizenship” and is dedicated to “making knowledge useful”.
As a butterfly develops from a pupa, so do the students at our institution. From the moment they register as green freshers to their capping at the hallowed graduation ceremony, our students undergo an intellectual evolution.
Website: www.dut.ac.za
J. Craig Venter Institute (JCVI)
The J. Craig Venter Institute was formed in October 2006 through the merger of several affiliated and legacy organizations: The Institute for Genomic Research (TIGR), The Center for the Advancement of Genomics (TCAG), The J. Craig Venter Science Foundation, The Joint Technology Center, and the Institute for Biological Energy Alternatives (IBEA). Today these organizations form one large multidisciplinary, genomics-focused organization. With approximately 200 scientists and staff, more than 250,000 square feet of laboratory space, and locations in Rockville, Maryland and San Diego, California, the new JCVI is a world leader in genomic research.
Infectious Disease
As the population on our planet continues to increase, and as issues like global climate change arise, the threat of emerging infectious diseases intensifies. One of the longstanding research focus areas at the JCVI is microbial and viral genomics and how they relate to human infectious disease. That work continues today in several major areas, including comparative microbial genomics of sexually transmitted pathogens; elucidation of the human microbial flora within various body cavities; and sequencing and analysis of human pathogens such as the anthrax bacterium, of the mosquito species that carry yellow fever and malaria, and of various strains of influenza and coronavirus. A thorough genomic understanding of these diseases will enable JCVI researchers to collaborate on new vaccines and treatments for these global health threats.
Website: www.jcvi.org
National Institute of Allergy and Infectious Diseases (NIAID)
Background
NIAID’s Division of Microbiology and Infectious Diseases (DMID) is a pioneer in using genomics technologies to study infectious diseases, supporting successful sequencing of thousands of microbial genomes including influenza, dengue, Bacillus anthracis, Plasmodium falciparum, Aspergillus fumigatus, Mycobacterium tuberculosis, and Staphylococcus aureus. Today, sequencing a bacterial genome costs less than a dollar, yet analysis may cost as much as tens of thousands of dollars. DMID has been expanding genomics activities over the last decade to provide the scientific community with genomic data as well as resources such as reagents, databases, software, and computational tools that are essential for analyzing and applying research findings. Data and resources generated through the genomics initiatives are rapidly made available to the scientific community.
In recent years, the field of metagenomics has emerged as a complementary approach to studying microbes. By studying microbial communities inhabiting a particular human body site, metagenomics is enabling scientists to gain insight into their role in health and disease without culturing individual microbes. Sequence data, combined with other biochemical and microbiological information, are being used to improve detection of pathogens, diagnose infectious diseases, and identify potential new targets for novel drugs and vaccines. In addition, comparing the sequences of different strains, species, and clinical isolates is crucial for identifying genetic polymorphisms that correlate with phenotypes such as drug resistance, morbidity, and infectivity.
Current NIAID/DMID Genomics Programs
The current Genomics Programs provide comprehensive genomics, functional genomics, proteomics, structural genomics, bioinformatics and other ‘omics’ resources and reagents to the scientific community for basic and applied research in infectious diseases. The ultimate goal of the Programs is to facilitate understanding of pathogen biology, pathogenesis, and pathogen-host interaction, and to develop potential new targets and platforms for discovering novel drugs, vaccines and diagnostics. The components of the Genomics Programs share certain characteristics: they probe the enormous complexity and diversity of biological systems; they use techniques that can examine many thousands of separate biological entities or phenomena in a single experiment; and they generate new data on an unprecedented scale, all of which must be stored, curated, analyzed, and shared to the greatest extent possible.
Website: www.niaid.nih.gov
Workshop Presenters J. Craig Venter Institute (JCVI)
Andres Gomez, Ph.D. Staff Scientist Human Biology Group
Derek Harkins Senior Bioinformatics Analyst Infectious Diseases Group
Alexander Voorhies, Ph.D. Post‐doctoral Fellow Infectious Diseases Group
Logistic Information
Venue: Durban University of Technology, Durban, South Africa
Meals: The following meals will be provided for workshop participants.
  Thursday April 21, 2016: Lunch & Tea
  Friday April 22, 2016: Tea
INTRODUCTION TO MICROBIOME STUDIES AND DATA ANALYSIS
Durban University of Technology, Durban, South Africa
Dates: April 21 & 22, 2016
Workshop Facilitators & Presenters:
  Dr. Andres Gomez, J. Craig Venter Institute
  Derek Harkins, J. Craig Venter Institute
  Dr. Alexander Voorhies, J. Craig Venter Institute
Workshop Programme:
April 21, 2016 AM Session
Location: Coastlands Musgrave
8:30 – 9:00 Arrival and Registration, Tea
9:00 – 9:15 Dr. Suren Singh
  Welcome and Opening Remarks
  Outline of Objectives for the Workshop
9:15 – 10:15 Dr. Andres Gomez
  Overview on Microbial Ecology and Microbiome Studies
  Experimental and Technical Considerations in Microbiome Studies
  Study Design
  Sample Collection and Preservation
  DNA Extraction
  Phylogenetic Markers (16S)
  16S vs. Metagenomic Studies (Composition vs. Potential Function)
  Library Construction
10:15 – 11:00 Derek Harkins
  Sequencing and Informatics
  Sequencing Technologies
  16S rRNA Data Analysis Tools
  Taxonomic Classification and Databases - the OTU Table
  Compute Infrastructure
  Pipelines Used at JCVI
11:00 – 11:30 Tea
11:30 – 1:00 Dr. Alexander Voorhies
  Software setup (downloading and installing software)
  Download and install mothur
    o http://www.mothur.org/wiki/Download_mothur
  Download and install R
    o https://cran.r-project.org/bin/windows/base/
  Hands on data processing introduction
  Introduction to mothur
  Basic data organization and input files
  Processing a small dataset that can be run on a local installation of mothur
1:00 – 1:45 Lunch
April 21, 2016 PM Session
Location: Coastlands Musgrave
1:45 – 3:30 Hands on data processing introduction (continued)
  Introduction to mothur
  Basic data organization and input files
  Processing a small dataset that can be run on a local installation of mothur
3:30 – 4:00 Tea
4:00 – 5:30 Dr. Andres Gomez
  Hands on statistical analysis of microbiome data
  A Case Study
  The OTU Table and Metadata
  Introduction to R (a free software tool for statistical analyses)
  Useful R Packages for Ecological and Microbiome Analyses
  Discovering Patterns in Microbiome Data
    o Alpha Diversity
5:30 Adjourn
April 22, 2016 AM Session
Location: Faculty of Applied Sciences Computer Lab: Block S9 Level 3
8:30 – 9:00 Arrival & Tea
9:00 – 11:30 Dr. Andres Gomez
  Hands on statistical analysis of microbiome data (continued)
  Discovering Patterns in Microbiome Data
    o Beta Diversity
    o Distance Matrices and Ordination Techniques
    o Marker Discovery Methods
11:30 Adjourn
REFERENCE LIST
INTRODUCTION TO MICROBIOME STUDIES AND MICROBIOME DATA ANALYSIS
1. Human Microbiome Project Consortium. "Structure, function and diversity of the healthy human microbiome." Nature 486.7402 (2012): 207-214.
2. Turnbaugh, Peter J., et al. "An obesity-associated gut microbiome with increased capacity for energy harvest." Nature 444.7122 (2006): 1027-1031.
3. La Rosa, Patricio S., et al. "Hypothesis testing and power calculations for taxonomic-based human microbiome data." PloS one 7.12 (2012): e52078.
4. Kelly, Brendan J., et al. "Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA." Bioinformatics (2015): btv183.
5. Hale, Vanessa L., et al. "Effect of preservation method on spider monkey (Ateles geoffroyi) fecal microbiota over 8 weeks." Journal of microbiological methods 113 (2015): 16-26.
6. Wesolowska-Andersen, Agata, et al. "Choice of bacterial DNA extraction method from fecal material influences community structure as evaluated by metagenomic analysis." Microbiome 2.1 (2014): 1.
7. Yuan, Sanqing, et al. "Evaluation of methods for the extraction and purification of DNA from the human microbiome." PloS one 7.3 (2012): e33865.
8. Lane, David J., et al. "Rapid determination of 16S ribosomal RNA sequences for phylogenetic analyses." Proceedings of the National Academy of Sciences 82.20 (1985): 6955-6959.
9. Pace, Norman R. "A molecular view of microbial diversity and the biosphere." Science 276.5313 (1997): 734-740.
10. Hamady, Micah, and Rob Knight. "Microbial community profiling for human microbiome projects: Tools, techniques, and challenges." Genome research 19.7 (2009): 1141-1152.
11. Venter, J. Craig, et al. "Environmental genome shotgun sequencing of the Sargasso Sea." Science 304.5667 (2004): 66-74.
16S rRNA SEQUENCE ANALYSIS
1. Caporaso, JG, Justin Kuczynski, Jesse Stombaugh, Kyle Bittinger, Frederic D Bushman, Elizabeth K Costello, Noah Fierer, Antonio Gonzalez Pena, Julia K Goodrich, Jeffrey I Gordon, Gavin A Huttley, Scott T Kelley, Dan Knights, Jeremy E Koenig, Ruth E Ley, Catherine A Lozupone, Daniel McDonald, Brian D Muegge, Meg Pirrung, Jens Reeder, Joel R Sevinsky, Peter J Turnbaugh, William A Walters, Jeremy Widmann, Tanya Yatsunenko, Jesse Zaneveld and Rob Knight. QIIME allows analysis of high-throughput community sequencing data. Nature Methods, 2010; doi:10.1038/nmeth.f.303
2. Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, Fierer N, Knight R. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc Natl Acad Sci U S A. 2011 Mar 15;108 Suppl 1:4516-22. doi: 10.1073/pnas.1000080107. Epub 2010 Jun 3.
3. Chakravorty, Soumitesh, Danica Helb, Michele Burday, Nancy Connell and David Alland. A detailed analysis of 16S ribosomal RNA gene segments for the diagnosis of pathogenic bacteria. J Microbiol Methods, 2007 May; 69(2): 330-339.
4. DeSantis, T. Z., P. Hugenholtz, N. Larsen, M. Rojas, E. L. Brodie, K. Keller, T. Huber, D. Dalevi, P. Hu, and G. L. Andersen. 2006. Greengenes, a Chimera-Checked 16S rRNA Gene Database and Workbench Compatible with ARB. Appl Environ Microbiol 72:5069-72.
5. Edgar, R.C. (2013) UPARSE: Highly accurate OTU sequences from microbial amplicon reads, Nature Methods [Pubmed:23955772, dx.doi.org/10.1038/nmeth.2604].
6. Hildebrand F, Tadeo RY, Voigt AY, Bork P, Raes J. 2014. LotuS: an efficient and user-friendly OTU processing pipeline. Microbiome 2: 30.
7. Kim, O.S., Cho, Y.J., Lee, K., Yoon, S.H., Kim, M., Na, H., Park, S.C., Jeon, Y.S., Lee, J.H., Yi, H., Won, S., Chun, J. (2012). Introducing EzTaxon: a prokaryotic 16S rRNA Gene sequence database with phylotypes that represent uncultured species. Int J Syst Evol Microbiol 62, 716–721.
8. Schloss, PD, Westcott SL, Ryabin, T., Hall JR, Hartmann M, Hollister EB, Lesniewski, RA, Oakley BB, Parks DH, Robinson CJ, Sahl JW, Stres, B., Thallinger GG, Van Horn DJ, Weber CF. Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.
9. Yarza, Pablo, Pelin Yilmaz, Elmar Pruesse, Frank Oliver Glöckner, Wolfgang Ludwig, Karl-Heinz Schleifer, William B. Whitman, Jean Euzéby, Rudolf Amann & Ramon Rosselló-Móra. Uniting the classification of cultured and uncultured bacteria and archaea using 16S rRNA gene sequences. Nature Reviews Microbiology 12, 635–645 (2014).
HANDS ON STATISTICAL ANALYSES OF MICROBIOME DATA
1. Borcard, Daniel, Francois Gillet, and Pierre Legendre. "Numerical Ecology with R." Springer, NY (2011).
2. Roberts, David W. "Package 'labdsv'." (2015). http://ecology.msu.montana.edu/labdsv/R
3. Oksanen, Jari, et al. "The vegan package." Community ecology package 10 (2007). http://vegan.r-forge.r-project.org/
4. Paradis, Emmanuel, Julien Claude, and Korbinian Strimmer. "APE: analyses of phylogenetics and evolution in R language." Bioinformatics 20.2 (2004): 289-290. https://cran.r-project.org/web/packages/ape/index.html
5. Warnes, Gregory R., et al. "gplots: Various R programming tools for plotting data." R package version 2.4 (2009). https://cran.r-project.org/web/packages/gplots/index.html
6. Morgan, Xochitl C., and Curtis Huttenhower. "Human microbiome analysis." PLoS Comput Biol 8.12 (2012): e1002808.
7. Alpha Diversity Measures. http://scikit-bio.org/docs/latest/generated/skbio.diversity.alpha.html