Introduction to single-cell RNA-seq data analysis

4th, 11th and 18th Nov 2021

Taught remotely

Bioinformatics Training, Craik-Marshall Building, Downing Site, University of Cambridge

Instructors

Abigail Edwards - Bioinformatics Core, Cancer Research UK Cambridge Institute
Ashley Sawle - Bioinformatics Core, Cancer Research UK Cambridge Institute
Hugo Tavares - Bioinformatics Training Facility, University of Cambridge
Katarzyna Kania - Genomics Core, Cancer Research UK Cambridge Institute
Stephane Ballereau - Bioinformatics Core, Cancer Research UK Cambridge Institute

Helpers

Chandra Chilamakuri - Bioinformatics Core, Cancer Research UK Cambridge Institute
Chloe Pacyna - Wellcome Sanger Institute
Jon Price - The Gurdon Institute, University of Cambridge
Karsten Bach - Department of Pharmacology, University of Cambridge

Outline

This workshop is aimed at biologists interested in learning how to perform standard single-cell RNA-seq analyses.

The course will focus on the droplet-based assay by 10x Genomics. It covers running the accompanying cellranger pipeline to align reads to a genome reference and count the number of reads per gene, reading the count data into R, quality control, normalisation, data set integration, clustering and identification of cluster marker genes, as well as differential expression and abundance analyses. You will also learn how to generate common plots for analysis and visualisation of gene expression data, such as t-SNE, UMAP and violin plots.

We have run this course twice and are still learning how to teach it remotely. Please bear with us if there are any technical hitches, and be aware that the timings for the sections laid out in the schedule below may not be adhered to. We may need to make adjustments to the course as we go.

(Materials linked to below will be updated closer to the time of delivery)

Prerequisites

Some basic experience of using a UNIX/Linux command line is assumed.

Some R knowledge is assumed and essential. Without it, you will struggle on this course. If you are not familiar with the R statistical programming language, we strongly encourage you to work through an introductory R course before attempting these materials. We recommend our Introduction to R course.

Data sets

Two data sets:

'CaronBourque2020': pediatric leukemia, with four sample types, including:
- pediatric Bone Marrow Mononuclear Cells (PBMMCs)
- three tumour types: ETV6-RUNX1, HHD, PRE-T
'HCA': adult BMMCs (ABMMCs) obtained from the Human Cell Atlas (HCA)
- (here downsampled from 25000 to 5000 cells per sample)

Tentative schedule

Tentative schedule for a 3-day course.

(long sessions include breaks)

Day 1: Thursday 4th Nov

09:30 - 09:40 Welcome
09:40 - 10:25 Introduction - Katarzyna Kania
- Slides
10:25 - 10:30 5 min break
10:30 - 10:40 Preamble: data set and workflow - Stephane Ballereau
- Slides
10:40 - 12:30 Library structure, cellranger for alignment and cell calling - Stephane Ballereau
- Slides
- Alignment with Cell Ranger
12:30 - 13:30 lunch break
13:30 - 17:30 QC and exploratory analysis - Ashley Sawle
- Slides (pdf)
- QC and preprocessing
- Exercise

Day 2: Thursday 11th Nov

09:30 - 09:40 Recap
09:40 - 12:30 Normalisation - Stephane Ballereau
12:30 - 13:30 lunch break
13:30 - 15:25 Feature selection and dimensionality reduction - Hugo Tavares
- Slides
- Materials
15:25 - 15:35 10 min break
15:35 - 17:30 Batch correction and data set integration - Abigail Edwards

Day 3: Thursday 18th Nov

09:30 - 09:40 Recap
09:40 - 11:05 Clustering - Stephane Ballereau
11:05 - 11:15 10 min break
11:15 - 12:30 Identification of cluster marker genes - Hugo Tavares
- Slides
- Cluster marker genes
- Worksheet in Exercises/09_ClusterMarkerGenes.R
12:30 - 13:30 lunch break
13:30 - 15:25 Differential expression between conditions - Stephane Ballereau
15:25 - 15:35 10 min break
15:35 - 17:30 Differential abundance between conditions - Stephane Ballereau
