Introduction to Bioconductor

Monday, June 1, 2020 - Tuesday, June 30, 2020

Course Description

The Bioconductor project provides open-source software based on the R programming language for statistical analysis and visualization of high-throughput genomic data. This course provides a broad introduction to the project, from navigating its large collection of packages to its core functionality for representation, manipulation, and visualization of genomic data. We will learn how to efficiently analyze genomic intervals and SNPs, how to manage experiments of one or more genomic data types together with clinical and pathological data, and how to visualize genomic data. This workshop equips participants with essential background for a wide range of applications in statistical genomics and genetic epidemiology, such as GWAS, RNA-seq, DNA methylation, ChIP-seq, metagenomics, and multi-omic experiments.

Course Objectives

Part 1: Bioconductor

  • Find, install, and learn how to use Bioconductor packages
  • Import and manipulate genomic files and Bioconductor data objects
  • Start an RNA-seq differential expression workflow

Part 2: Data structures for representing 'omics experiments

  • Use the ExpressionSet data structure to represent, manipulate, and analyze microarray data
  • Use the SummarizedExperiment data structure to represent, manipulate, and analyze RNA-seq data
  • Understand similarities and differences between the two data structures
  • Create both data structures from public data resources
  • Use the MultiAssayExperiment data structure to coordinate multi-omics experiments

Part 3: GenomicRanges

  • Understand how to apply the *Ranges infrastructure to solve common bioinformatic challenges in genomic research
  • Gain insight into the design principles of the infrastructure and how it is meant to be used
  • Learn basics of genomic region algebra and how to carry out intra- and inter-region operations

Part 4: Visualizing genomic data

  • Understand basic principles of the grammar of graphics used in R/Bioconductor
  • Learn how to display heatmaps for genomic data exploration
  • Learn how to display genomic data tracks in a genome browser view


This workshop is accessible to those with little or no experience using Bioconductor, and more experienced users can also benefit from the broad overview of Bioconductor paradigms. The workshop assumes elementary knowledge of R, which can be gained in advance or simultaneously from other courses such as the introductory course from DataCamp. A basic understanding of genome biology and statistical analysis is helpful, but there are no specific prerequisites.

Technical Requirements


Levi Waldron, PhD
Levi Waldron completed a PhD at the University of Toronto and a post-doc in the Huttenhower lab at the Harvard Chan School of Public Health, and is an Associate Professor of Biostatistics at the City University of New York School of Public Health at Hunter College. He taught Applied Statistics for High-Throughput Biology as a 2015-16 U.S. Fulbright Scholar and visiting professor at the University of Trento, Italy. He is a member of the Bioconductor Technical Advisory Board, a developer of core Bioconductor infrastructure for multi-omics data analysis, and a contributor to an effort to sequence the oral microbiome of a representative sample of New York City residents through the NYC-HANES II project.

Ludwig Geistlinger, PhD

Ludwig Geistlinger is a post-doctoral fellow in cancer genomics in the lab of Levi Waldron.
His research interests are in computational biology and biostatistics, focusing on the field of functional enrichment analysis of high-throughput genomic assay data.
Prior to his work at CUNY ISPH, he completed a PhD on network-based analysis of gene expression data at the University of Munich, Germany, and a post-doctoral fellowship at the University of São Paulo, Brazil, where he analyzed the effects of structural genome variation on gene expression.
He designs and implements methods for the analysis of large-scale genomic assay data to improve the understanding of molecular mechanisms underlying specific cancer types. This also includes assessment of the clinical relevance of molecular cancer subtypes, especially whether their incorporation in personalized healthcare could improve treatment and clinical outcome.

Course Fee

Early registration discount before May 1, 2020: $250.00
After May 1, 2020: $250.00


The registration period has closed for this event.

Online Course Format

This is a short digital course, equivalent to approximately 5 hours of classroom instruction. Lectures and course material will be presented online. The flexible format will include video or audio recordings of lecture material, file sharing and topical discussion fora, self-assessment exercises, real-time electronic office hours, and access to instructors for feedback during the course. Registrants for EPIC digital courses should have high-speed internet access. Any additional information about technical requirements and access to the course will be provided the month before the course begins.
