Laboratory of Computer and Information Science / Neural Networks Research Centre CIS Lab Helsinki University of Technology

T-61.5050 High-throughput bioinformatics V P, 5/7 cr

(Title in Finnish: T-61.5050 Suurkapasiteettimittausten bioinformatiikka V L)

Lecturers D.Sc.(Tech) Janne Nikkilä, Laboratory of Computer and Information Science,
Doc. Petri Auvinen, Institute of Biotechnology,
D.Sc. Alvis Brazma, European Bioinformatics Institute, Microarray group
Assistant M.Sc. Merja Oja, Laboratory of Computer and Information Science
Credits 5 or 7
Semester Spring 2007 (during periods III and IV)
Lectures On Thursdays at 14-16 in room T3 in the computer science building,
Konemiehentie 2, Otaniemi, Espoo.
The first lecture is on 18.1.2007.
Exercises On Thursdays at 16-18 in room A328.
Language English

Introduction

This year the topic is data analysis for gene expression.

Microarray technology has made it possible to monitor gene expression (the level of activation of genes) on a large scale and has become very popular in genomics research. This high-throughput technique provides information on thousands of genes in parallel and is producing a huge amount of valuable data. Data sets can easily have tens or hundreds of thousands, or even millions, of data points. This necessitates the use of sophisticated data-analysis tools for processing and mining this type of genomic data, in order to understand the underlying genetic networks and to answer the complex biological and medical questions involved.

This course is designed to introduce computational and statistical concepts and tools necessary to analyse microarray-based gene expression data, a skill that is in high demand by biotechnology, bioinformatics and pharmaceutical companies. The skills learned in this course will also be applicable to other problems involving large data sets, such as proteomics, and more generally in data mining.

Prerequisites for attending

This course is intended mainly for advanced undergraduate (Master's level) and doctoral students of computer science, statistics, and applied mathematics, but students from other fields are welcome as well. In particular, mathematically oriented biology, bioinformatics, and medical students should benefit from the course.

Basic knowledge of probability, statistics, vector algebra, and calculus is assumed (the basic mathematics courses at HUT). A basic course in bioinformatics, such as S-114.2510 Computational Systems Biology, or an equivalent background is assumed as well.

Course format

The course consists of the following parts: lectures, exercise sessions, homework assignments, an intensive course (Tue 10-12, Thu 14-16 and Fri 14-16, from 13.2. to 1.3.) and an exam.

Signing up for the course

HUT students: In Webtopi
Other universities: Send email to the organizers or sign up at the introduction lecture.

Course material

The lectures will be based on the book: Sorin Draghici, Data analysis tools for DNA microarrays (2003)

The lecture slides are available only to course participants. The lecture_notes/ directory is password protected. Course participants have received an e-mail about this. If you do not yet have the password, please send e-mail to the course assistant.

A list of the material covered in the course can be found here.

Some links to tutorials on how to use R can be found from:

Homework assignment

The description of the homework assignment is here. The deadline for returning the exercise is the 27th of May. The deadline is strict. The report and the code used in the assignment should be returned by e-mail (

The data set for the assignment can be retrieved from here. Note that the data directory is password protected. The password is the same as the one used to access the lecture notes. It has been e-mailed to course participants.

Course feedback

A course feedback form for this course can be found through links from the page You should fill in the form by May 23rd.

Course schedule

Time Lecturer Subject
Thu 18.1. Petri, Janne, Merja
  • Introduction + administrative issues [Slides 4/page (password protected)]
  • Basics of biology (Draghici: Ch 1 + a lot of extra material)
Thu 25.1. Petri, Janne
  • Lecture: Microarrays (Draghici: Ch 2 and 3 + a lot of extra material)
Thu 1.2. Janne, Merja
Thu 9.2. Janne, Merja
Tue 13.2.- Thu 1.3. Alvis Brazma
15.3. Janne, Merja
22.3. Janne, Merja
29.3. Janne, Merja
12.4. Janne, Merja
  • Lecture: Classification of HT data. [Slides 4/page (password protected)]
  • No exercise, because the last lecture was postponed.
19.4. Janne, Merja
26.4. Janne, Merja
  • Summary lecture (recap of course contents)
  • Exercise: Demo of MZmine, a tool for processing mass spectrometry data
3.5. Janne, Merja
  • Lecture replaced by a special recap exercise session. Come and ask about the exercises!
  • 14-16 Exercise, 16-18 nothing

For more information, please send email to



Page maintained by, last updated Friday, 21-Dec-2007 12:40:29 EET