Data Processing for Next Generation Sequencing

2020 CRUK CC Bioinformatics Summer School

View the Project on GitHub ss-lab-cancerunit/NGSdataProcessing

Outline

In this part of the course we will cover NGS data quality assessment, common file formats, and the principles of short-read alignment.

Prerequisites

All the bioinformatics tools we will use in this tutorial require some basic experience with the UNIX/Linux command line. Although the majority of our practicals can be followed by simply copying commands into the terminal window, we strongly encourage you to take an introductory command line course before you start analysing your own sequencing data, for example: Introduction to the Command Line for Genomics

Course etiquette

To run the course as smoothly as possible, we all need to follow a few simple rules:

  1. Please mute your microphone.
  2. To get help from a tutor, please click the “Raise Hand” button in Zoom. This can be found under the “Participants” button. A tutor will then contact you in the chat. If necessary, you and the tutor can be moved to a breakout room where you can discuss your issue in more detail.
  3. Please ask any general questions by typing them into the Google Doc mentioned above.
  4. During practicals, when you are done, please press the green “Yes” button. This way we will know when we can move on.

Course materials

09:40 - 10:20 Introduction to next generation sequencing and common file formats
L1 slides

10 min break

10:30 - 11:20 Quality control and artefact removal
L2 slides
Practical

10 min break

11:30 - 12:30 Short-read alignment
L3 slides
Practical

Authors

Joanna A. Krupka
MRC Cancer Unit / Department of Haematology, Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge
Shoko Hirosue
MRC Cancer Unit, University of Cambridge
Shamith Samarajiwa
MRC Cancer Unit, University of Cambridge
Dora Bihary
MRC Cancer Unit, University of Cambridge