A guide to the technology, analysis workflows, tools, and resources for next-generation sequencing data analysis.
This course will provide insight into how biological knowledge can be derived from genomics experiments, along with training in the different approaches to analysing such data. The main focus will be on introducing sequence informatics, re-sequencing, and variant calling in the analysis of higher eukaryotes, with an emphasis on human genetic research. Throughout the week, more advanced topics will introduce the creation of pipelines, automation, and the scaling up of analyses.
Practical sessions will be run on datasets prepared by the trainers, not on personal research data. Participants will learn how to process these training datasets and to apply appropriate statistical methods in their analyses.
Who is this course for?
The course is aimed at PhD students and post-doctoral researchers who are starting to use high-throughput sequencing technologies and bioinformatics methods in their research. The content is most applicable to those working with eukaryotic genomes, especially in the area of human genomics.
Participants will require a basic knowledge of the Unix command line to complete the practical sessions. A short pre-course session will be offered. Additionally, we recommend the free tutorial "Basic introduction to the Unix environment" or a similar one.
Please note that participants without basic knowledge of the Unix command line will have difficulty completing the practical sessions.
What will I learn?
After the course, participants will be able to:
- State the advantages and limitations of short- and long-read sequencing technologies
- Apply appropriate quality control (QC) methods and aligners to unassembled short and long reads
- Perform variant calling analysis and annotation
- Scale up and automate simple genomics pipelines
- Access genomic datasets from online public resources
Course content
During this course you will learn about:
- Quality control methods for cleaning raw sequencing data
- Alignment of reads to a reference genome
- File format conversion and processing
- Tools for variant calling (both single-nucleotide and copy-number variants)
- Approaches for scaling up analyses and for reproducible research
- Methodologies for variant annotation
- Resources for genomic data:
  - European Variation Archive
  - European Nucleotide Archive
  - Ensembl
Trainers
Chiara Batini, University of Leicester
Kayesha Coley, University of Leicester
Maira Ihsan, EMBL-EBI
Sean Laidlaw, Wellcome Sanger Institute
Charles Solomon, University of Leicester
Maxime Tarabichi, Université Libre de Bruxelles
Course fee: £825.00, inclusive of four nights' accommodation and catering, including dinner
Application deadline: 7 August 2023