Laboratory of Computer and Information Science / Neural Networks Research Centre CIS Lab Helsinki University of Technology

T-61.6080 Special course in Bioinformatics II:
Prior knowledge and background data in computational inference V P, 5 cr

Notices (updated 16.10.2007)
  • The starting time for the sessions was changed to 12:30
  • Jaakko Peltonen will give a talk on multitask learning on 18.10.

Lecturer D.Sc.(Tech.) Janne Nikkilä, Laboratory of Computer and Information Science
Assistant M.Sc.(Tech.) Leo Lahti, Laboratory of Computer and Information Science
Credits (ECTS) 5
Semester Autumn 2007, period I and II
Seminar sessions Thursdays 12.30-14.00 in lecture hall T5 in the computer science building,
Konemiehentie 2, Otaniemi, Espoo. The introductory session is on September 20th.
Language English
Registration TKK students: WebTopi, others: send mail to

Introduction

In computational modeling tasks, some background data or prior knowledge is usually available in addition to the actual data concerning the primary task. This background information can be useful in analysing new experiments. The issue is especially prominent in bioinformatics, where a single biological experiment typically contains very few samples compared to the complexity of the system that generates the data (cells and organisms), while public databases on the internet are increasingly populated with knowledge and data from analogous experiments. The main problem is typically that the background data is not directly related to the current task, or only part of it is known to be relevant: in bioinformatics, the data comes from slightly different experiments, organisms, measurement procedures, etc. This setting requires advanced computational methods that are able to utilize expert knowledge and/or learn the relevance from the data.

This course introduces the computational and statistical concepts and tools currently used for utilizing background data and prior knowledge, especially in bioinformatics. It reviews techniques from, for example, relevant subtask learning, the use of prior information with Bayesian methods, and applications of supervised learning.

Prerequisites

This course is intended mainly for graduate students of computer science, statistics, and applied mathematics, but students from other fields are welcome as well. In particular, mathematically oriented biology, bioinformatics, and medical students should benefit from the course.

Basic knowledge of probability, statistics, vector algebra, and calculus is assumed (the basic mathematics courses at HUT). A basic course in bioinformatics, such as S-114.2510 Computational Systems Biology, or an equivalent background is assumed as well.

Course format

Journal club (abstract + presentation + project work) (5 cr).

Requirements for passing the course

The course is graded as fail/pass/pass with distinction, and the same scale is used for both the abstract + presentation and the project work. To pass the course, both the abstract + presentation and the project work must receive at least a pass grade.

Make a presentation about a subject chosen during the first sessions. Prepare a one-page extended abstract about your subject and send it electronically to the assistant at least 48 hours before your presentation. The assistant will comment on the abstract if needed; you then send the corrected abstract to the assistant at least 24 hours before your presentation, and the assistant forwards it to the other course participants.

Complete the project work and return a concise written report about it.

Attend the seminar sessions and participate actively in the discussion.

Signing up for the course

TKK students: In WebTopi
Other universities: Send email to the organizers or sign up at the introduction lecture.

Literature/Course material

A collection of articles (continuously updated).

Introductory material (read before the course):

Schedule

We aim to have two seminar presentations at each meeting. The time reserved for each presentation is 25 min, followed by 5 min for discussion.
Time | Lecturer | Subject and material
20.9. | Janne Nikkilä, Leo Lahti | Administrative issues; Introduction
4.10. | Andrey Ermolov | Integrative missing value estimation for microarray data. Hu et al., BMC Bioinformatics, 2006. (html)
11.10. | Lauri Lyly | Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data. Carvalho et al., Biostatistics, 2007. (html)
18.10. | Jaakko Peltonen | Multitask learning
25.10. | Andrey Ermolov | Penalized Probabilistic Clustering. Z. Lu and T.K. Leen, Neural Computation, 2007. (html)
1.11. | Lauri Lyly | Logistic regression with an auxiliary data source. Liao et al., ICML 2005. (pdf)

Project work

Choose your own topic or one of the topics suggested in the last seminar meeting.

Write a short (one A4 page) project plan before the deadline and send it to the course assistant. We will give our suggestions and approval as agreed in the course.

The deadline for the project will be decided later. For more details, see the separate page.

For more information, please send email to Janne Nikkilä and Leo Lahti.


Page maintained by, last updated Tuesday, 19-Aug-2008 10:51:04 EEST