T-61.6080 Special course in Bioinformatics II:
Prior knowledge and background data in computational inference V P, 5 cr

Notices (updated 16.10.2007)

The starting time for the sessions was changed to 12:30
Jaakko Peltonen will give a talk on multitask learning on 18.10.

Lecturer	D.Sc.(Tech.) Janne Nikkilä, Laboratory of Computer and Information Science
Assistant	M.Sc.(Tech.) Leo Lahti, Laboratory of Computer and Information Science
Credits (ECTS)	5
Semester	Autumn 2007, period I and II
Seminar sessions	Thursdays 12.30-14.00 in lecture hall T5 of the Computer Science building, Konemiehentie 2, Otaniemi, Espoo. The introductory session is on September 20th.
Language	English
Web	http://www.cis.hut.fi/Opinnot/T-61.6080/
Registration	TKK students: WebTopi, others: send mail to t616080@cis.hut.fi
E-mail	t616080@cis.hut.fi

Introduction
Prerequisites for attending
Course format
Requirements for passing the course
Signing up for the course
Literature
Schedule
Project work
One page summary/brochure

Introduction

In computational modeling tasks, some background data or prior knowledge is usually available in addition to the actual data concerning the primary task. This background information can be useful in analysing new experiments. The issue is highlighted in bioinformatics, where a single biological experiment always contains very few samples compared to the complexity of the system that generates the data (cells and organisms), while public databases on the internet are increasingly populated with knowledge and data from analogous experiments. The main problem typically is that the background data is not directly related to the current task, or only part of it is known to be relevant: in bioinformatics the data comes from slightly different experiments, organisms, measurement procedures, etc. This setting requires advanced computational methods that are able to utilize expert knowledge and/or learn the relevance from the data.

This course introduces the computational and statistical concepts and tools currently used to exploit background data and prior knowledge, especially in bioinformatics. It reviews techniques from, for example, relevant subtask learning, the use of prior information in Bayesian methods, and applications of supervised learning.

Prerequisites

This course is intended mainly for graduate students of computer science, statistics, and applied mathematics, but students from other fields are welcome as well. In particular, mathematically oriented students of biology, bioinformatics, and medicine should benefit from the course.

Basic knowledge of probability, statistics, vector algebra, and calculus is assumed (the basic mathematics courses at HUT). A basic course in bioinformatics, such as S-114.2510 Computational Systems Biology, or an equivalent background is assumed as well.

Course format

Journal club (abstract + presentation + project work) (5 cr).

Requirements for passing the course

The course is graded as fail/pass/pass with distinction. The abstract with its presentation and the project work are each graded on the same scale. To pass the course, both the presentation with its abstract and the project work must receive at least a pass grade.

Give a presentation on a subject chosen during the first sessions. Prepare a one-page extended abstract on your subject and send it electronically to the assistant at least 48 hours before your presentation. The assistant will comment on the abstract if needed. Send the corrected abstract to the assistant at least 24 hours before your presentation; the assistant will then forward it to the other course participants.

Complete the project work and return a concise written report about it.

Attend the seminar sessions and participate actively in the discussion.

Signing up for the course

TKK students:	In WebTopi
Other universities:	Send email to the organizers or sign up at the introductory session.

Literature/Course material

A collection of articles (the list is still being updated).

Introductory material (read before the course):

Good tutorial for those with no biological background:
Molecular Biology for Computer Scientists
L. Hunter in AI and molecular biology
Web tutorial to bioinformatics in general

Schedule

We aim to have two seminar presentations at each meeting. Each presentation is allotted 25 minutes, followed by 5 minutes of discussion.

Time	Lecturer	Subject and material
20.9.	Janne Nikkilä, Leo Lahti	Administrative issues, introduction
4.10.	Andrey Ermolov	Integrative missing value estimation for microarray data. Hu et al., BMC Bioinformatics, 2006. (html)
11.10.	Lauri Lyly	Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data. Carvalho et al., Biostatistics, 2007. (html)
18.10.	Jaakko Peltonen	Multitask learning
25.10.	Andrey Ermolov	Penalized Probabilistic Clustering. Z. Lu and T.K. Leen, Neural Computation, 2007. (html)
1.11.	Lauri Lyly	Logistic regression with an auxiliary data source. Liao et al., ICML 2005. (pdf)
8.11.

Project work

Choose your own topic or one of the topics suggested in the last seminar meeting.

Write a short (A4) project plan and send it to the course assistant before the deadline. We will give our suggestions and approval as agreed during the course.

The deadline for the project will be decided later. For more details, see the separate page.

For more information, please send email to t616080@cis.hut.fi.

Welcome,
Janne Nikkilä and Leo Lahti