Microbial ecology is one of many fields that have benefitted greatly from technical advances in DNA sequencing. In particular, low-cost culture-independent sequencing has made metagenomic and metatranscriptomic surveys of microbial communities practical, including bacteria, archaea, viruses, and fungi associated with the human body, other hosts, and the environment. The resulting data have stimulated the development of new computational approaches to meta'omic sequence analysis, including metagenomic assembly, microbial identification, and gene, transcript, and pathway functional profiling.
We will present a high-level introduction to computational metagenomics, highlighting the state of the art in the field as well as outstanding challenges. This introduction will cover the biological goals of typical meta'omic studies and the bioinformatic processes currently available to achieve them. It will briefly summarize the major aspects of metagenomic analysis to be covered here: reference genome-based community composition and functional profiling, along with methods for constructing new genomic references by de novo assembly. We will discuss the challenges associated with precisely quantifying members of a microbial community, functionally analyzing the gene families in a community, associating those gene families with their source organisms, and combining gene families into pathways for metabolic profiling.
Finally, as sequencing technologies deliver more data for the same price, our ability to examine complex microbial communities using sequencing grows. For environmental communities, far fewer reference genomes or transcriptomes are typically available than for human-associated microbes, and the substantial diversity of many communities means that terabases of sequencing may be needed to recover a significant fraction of the community metagenome. We will introduce large-scale de novo assembly, reference-free methods for investigating community coverage, and diversity estimation for shotgun sequencing data. We will conclude with an overview of the statistical challenges inherent to analyzing the compositional and count data arising from meta'omic studies, and present Bioconductor solutions for simplifying these analyses. The workshop will include standardized protocols for microbial profiling, functional profiling, and metagenome/metatranscriptome assembly with benchmarks and examples.
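As a small illustration of why meta'omic count data are called compositional, the sketch below shows that two samples sequenced at different depths can carry the same relative abundances, so raw counts alone say little about absolute abundance. It also applies a centered log-ratio (CLR) transform, one commonly used way to handle compositional data. The taxon counts, the function names, and the pseudocount of 0.5 are all assumptions made for illustration; this is not the workshop's protocol or any Bioconductor API.

```python
import math

def relative_abundance(counts):
    """Convert raw read counts for one sample into proportions summing to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's counts.

    The pseudocount (an assumed value) avoids log(0) for unobserved taxa.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical counts for three taxa in the same community,
# sequenced at two different depths.
shallow = [10, 30, 60]
deep = [100, 300, 600]

# Sequencing depth differs tenfold, yet the proportions are identical.
print(relative_abundance(shallow))
print(relative_abundance(deep))

# CLR values are centered, so they sum to (approximately) zero per sample.
print(clr(shallow))
```

Because each sample's values are constrained by its total read count, statistical methods that assume independent features can be misleading on such data, which is one of the challenges noted above.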