GBIO0002-1: Genetics and bioinformatics 2015-16

This course introduces the genetic concepts needed to understand a selection of bioinformatics-related data analysis problems. A variety of analytic tools for solving these problems will be explained and illustrated with examples.

The course is in part based on interactive ex-cathedra lectures and in part on interactive practical sessions. The exercise sessions allow students to become familiar with the theoretical concepts introduced during the theory classes. They prepare students to successfully carry out their homework assignments.

Exam information

The exam is open book. This written exam will count for 35% of the final mark.

Date 14 Jan 2016 (Thursday)
Location Room R0.7 (B28)
Time 08:30 am


Preliminary marks for HW1, HW2, and HW3 are posted on the HW submission website. Please log in to view them.

Course schedule and location

Current course schedule (will be uploaded soon)

Start date 15 Sept 2015 (Tuesday)
Location Room R.21, B28
Time from 14:00 until 18:00
Professors Kristel Van Steen and Franck DEQUIEDT


• 15 Sept 2015. Bring a PC/laptop to the 1st class to install R and follow the lecture hands-on.

• 22 Sept 2015. The main lecture will start at 14.30h instead of 14h. Between 14h and 14.30h I will organize an optional paper reading session on the two posted papers.

• 19 Sept 2015. This Tuesday between 13h and 14h we will have a question-and-answer session on the R statistical language and the HW1 assignment.

• 02 Nov 2015. The Nov 3rd class is a review class where students can put questions to the two professors and the TA. Class starts at 14h.

• 07 Nov 2015. The review class was an exciting one. Thank you to all the students who came, asked questions, and pointed out errors. We hope to organize another one on Dec 1st. We were impressed by the relevance of your questions!

• 12 Nov 2015. Please consider attending the Research Summer School in Statistical Omics 2016 in Croatia in Aug 2016. Some costs are covered. At least apply. More information is found HERE

• 17 Nov 2015. HW2 Style 2 part 2 of 2 was updated. New code dealing with a p-value of 0 was added. See page 4 of 4.

Assignment submission

Please submit your assignments through the online system. If the deadline has passed, email your assignment to the course TA.


Course material

September 15, 2015 - the 1st class. Class will start at 14h (Kristel)

Note: Please bring a PC to the 1st class to install R

pdf Intro lecture: Administration

pdf Document: Organization of GBIO0002 Homework Assignments (Kristel)

pdf Lecture 1: Setting the pace (Kristel)

   paper Supporting paper 1 (Fuller2013): Biggest challenges in bioinformatics (Kristel)

   paper Supporting paper 2 (Schattner2009): Genomics made easier: An introductory tutorial to genome datamining (Kristel)

   paper A reference list of bioinformatics books

ppt Practical Lecture 1: Intro into biological DBs and R (Kirill)

September 22, 2015 The main class will start at 14.30h

pdf In class reading: Primer on Medical Genomics. Part I: History of Genetics and Sequencing of the Human Genome

pdf Lecture 2: Genetics and Genetic Markers (Franck)

September 29, 2015 class starts at 14h

pdf Lecture 3: Genome-wide Association Studies (Kristel)

Background reading (in-class reading – hence for the exam):

  1. Anderson, Carl A., et al. "Data quality control in genetic case-control association studies." Nature protocols 5.9 (2010): 1564-1573.
  2. Balding, David J. "A tutorial on statistical methods for population association studies." Nature Reviews Genetics 7.10 (2006): 781-791

Guiding questions to help with paper understanding: Anderson2010 questions and Balding2006 questions

Supporting documents: 1) Genome-Wide Association Studies: A Primer; 2) A vision for the future of genomics research

HW1 - Principles of molecular biology and GWAS (due Oct 27th)

Style 1: Literature-based (the focus is on concept understanding)

Part 1: Genome Structure (choose one paper)

  1. Salgotra, R. K., B. B. Gupta, and C. N. Stewart Jr. "From genomics to functional markers in the era of next-generation sequencing." Biotechnology letters 36.3 (2014): 417-426.
  2. Patwardhan, Anand, Samit Ray, and Amit Roy. "Molecular Markers in Phylogenetic Studies-A Review." Journal of Phylogenetics & Evolutionary Biology (2014).
  3. Kayser, Manfred, and Peter de Knijff. "Improving human forensics through advances in genetics, genomics and molecular biology." Nature Reviews Genetics 12.3 (2011): 179-192.
  4. Snyder, Michael, and Mark Gerstein. "Defining genes in the genomics era." Science 300.5617 (2003): 258-260.
  5. Gerstein, Mark B., et al. "What is a gene, post-ENCODE? History and updated definition." Genome research 17.6 (2007): 669-681.

Part 2: GWAS (choose one paper)

  1. Gibson (2012) Rare and common variants: twenty arguments
  2. Peloso et al (2011) Choice of population structure informative principal components for adjustment in a case-control study

Style 2: Q&A-based

pdf HW1 part 1 - QandA: Genome Structure and Mapping

pdf HW1 part 2 - QandA: Genome-wide Association Studies

Load this R data object into R: HW1_WS.Rdata

October 6, 2015 Class will start at 13.30h

Note: please bring one charged PC per group to follow the in-class GenABEL tutorial

pdf Lecture 4: Practical aspects of Genome-wide Association Studies

October 13, 2015 Class will start at 13.30h (Franck)

pdf Lecture 5: Principles of DNA sequencing

October 20, 2015 The main class will start at 14h (Kristel)

pdf Lecture 6: I have my DNA sequences, now what?

Exam material reading: Bansal, Vikas, et al. "Statistical analysis strategies for association studies involving rare variants." Nature Reviews Genetics 11.11 (2010): 773-785

Supporting papers on various topics (not exam material)

  1. Regulatory DNA sequence motifs: D'haeseleer, Patrik. "What are DNA sequence motifs?." Nature biotechnology 24.4 (2006): 423-425
  2. Principal Component Analysis in a GWAS context: Ringnér, Markus. "What is principal component analysis?." Nature biotechnology 26.3 (2008): 303-304
  3. Aggregation of rare variants / collapsing of genomic data: Dering, Carmen, et al. "A comprehensive evaluation of collapsing methods using simulated and real data: excellent annotation of functionality and large sample sizes required." Frontiers in genetics 5 (2014).
  4. Population stratification: Zhang, Yiwei, Weihua Guan, and Wei Pan. "Adjustment for population stratification via principal components in association analysis of rare variants." Genetic epidemiology 37.1 (2013): 99-109

HW1 supporting reading material from the analytic point of view (3 papers - focusing on rare variant association analysis)

  1. Auer, Paul L., and Guillaume Lettre. "Rare variant association studies: considerations, challenges and opportunities." Genome medicine 7.1 (2015): 16
  2. Lee, Seunggeung, et al. "Rare-variant association analysis: study designs and statistical tests." The American Journal of Human Genetics 95.1 (2014): 5-23
  3. Moutsianas, Loukas, et al. "The Power of Gene-Based Rare Variant Methods to Detect Disease-Associated Variation and Test Hypotheses About Complex Disease." PLoS Genetics 11.4 (2015): e1005165.

October 27, 2015 Class will start at 13.30h (Kirill)

Note: please bring a charged PC to follow the in-class mini tutorials

pdf Lecture 7: Biological sequences (PowerPoint)

pdf Lecture 7: Biological sequences (PDF)

HW2 - Biological sequences (due Nov 20th)

Style 1: Literature-based

Part 1: DNA Sequencing (choose one paper)

  1. Shen, Tony, et al. "Clinical applications of next generation sequencing in cancer: from panels, to exomes, to genomes." Frontiers in genetics 6 (2015).
  2. van Dijk, Erwin L., et al. "Ten years of next-generation sequencing technology." Trends in genetics 30.9 (2014): 418-426. (choose ONE technology and explain it)
  3. Mwenifumbo, Jill C., and Marco A. Marra. "Cancer genome-sequencing study design." Nature Reviews Genetics 14.5 (2013): 321-332. (this is a big article with many aspects; students should develop only ONE part)

Part 2: Sequences analysis (choose one paper)

  1. Lee, Je Hyuk, et al. "Highly multiplexed subcellular RNA sequencing in situ." Science 343.6177 (2014): 1360-1363.
  2. Auer, Paul L., and Guillaume Lettre. "Rare variant association studies: considerations, challenges and opportunities." Genome medicine 7.1 (2015): 16.
  3. Liu, Lin, et al. "Comparison of next-generation sequencing systems." BioMed Research International 2012 (2012).

Style 2: Q&A-based

pdf HW2 part 1 - QandA: DNA Sequencing

pdf HW2 part 2 - QandA: Sequences analysis

November 3, 2015 Class starts at 14h (Kristel+Kirill+Franck)

Question-and-answer session. Bring your questions about the course material and homework. No formal lecture.

November 10, 2015 Class starts at 13.30h (Franck)

pdf Lecture 8: Principles of gene expression

November 17, 2015 Class starts at 14h (Kristel)

pdf Lecture 9: Statistical interactions

EXAM reading:

  1. Boulesteix, Anne-Laure, et al. "Letter to the Editor: On the term ‘interaction’ and related phrases in the literature on Random Forests." Briefings in bioinformatics (2014): bbu012.

Supporting lecture papers (non-exam material):

  1. Lee, Seunggeung, et al. "Rare-variant association analysis: study designs and statistical tests." The American Journal of Human Genetics 95.1 (2014): 5-23.
  2. Winham, Stacey J., et al. "SNP interaction detection with Random Forests in high-dimensional genetic data." BMC bioinformatics 13.1 (2012): 164.

HW3 - Biological Interactions (due Dec 8th)

Style 1: Literature-based (the focus is on concept understanding)

Part 1: Networks (choose one paper)

  1. Zhu, Xiaowei, Mark Gerstein, and Michael Snyder. "Getting connected: analysis and principles of biological networks." Genes & development 21.9 (2007): 1010-1024.
  2. Bonetta, Laura. "Protein-protein interactions: Interactome under construction." Nature 468.7325 (2010): 851-854.

Part 2: Forest of trees (choose one paper)

  1. Qi, Yanjun. "Random forest for bioinformatics." Ensemble machine learning. Springer US, 2012. 307-323.
  2. Pavlopoulos, Georgios A., et al. "Using graph theory to analyze biological networks." BioData mining 4.10 (2011): 1-27.

Style 2: Q&A-based

pdf HW3 part 1 - QandA: Gene Expression

pdf HW3 part 2 - QandA: RF and network inference (PDF version)

pdf HW3 part 2 - QandA: RF and network inference (Word version)

November 24, 2015 Class starts at 13.30h (Kirill)

pdf Lecture 10: Classic RF to uncover biological interactions

Saved workspaces:



December 1, 2015 Class starts at 13.30h (Kirill)

This will be a HW feedback and review class. I will comment on your HW1 and HW2. Bring any HW questions you would like to discuss in class.

December 8, 2015 Class starts at 13.30h (Kirill, Kristel, Franck)

In-class student presentations of literature-style HWs. Each presentation should be 30 min max, since we have many groups and 4 hrs. Your presentation should be submitted together with HW3, even if it covers HW1 or HW2.

December 15, 2015 Class starts at 13.30h (Kirill and Franck)

Franck and I (Kirill) will be available for questions from 13:30 until 14:30. Please fill out this doodle so we can estimate the number of attendees. Doodle poll

December 18, 2015 Office hours 9-10h (Kristel)

Prof. Kristel will be available in her office (B37 0.15 - see the building map on the ground floor) for your questions. Please complete the poll here