Please sign in using Topi.

T-122.102 Informaatiotekniikan erikoiskurssi VI 3-4 ov L V
T-122.102 Special Course in Information Technology VI 3-4 cr P V

Spring 2004

Yhteisesiintymämenetelmät diskreetin datan analysoinnissa
Co-occurrence methods in analysis of discrete data

Lecturers: PhD Kai Puolamäki, Prof. Jaakko Hollmén

Assistant: MSc Arto Klami


Time: Tuesdays at 14-16 in T4 (from 20 January to 27 April 2004)

Requirements: To pass the course with 3 credits you have to

  • participate actively (read the articles before the presentations, attend the sessions, take part in the discussion),
  • give one lecture on a given topic (instructions at the bottom of the page) and
  • submit a summary article of your presentation, to be published on the course web site.

To pass the course with 4 credits you additionally have to

  • carry out a small research project on a given topic and write a brief report on the results.

Prerequisites: first two years' mathematics courses; the target audience of the course is graduate students and pre-graduate students interested in research topics.

Language: English

Course material

Contents of the course

The topic of this research seminar is recent advances in the analysis of discrete data sets, the common theme being co-occurrences and how they are dealt with. The emphasis of the course is on approaches motivated by information theory (e.g. mutual information) or by probabilistic modelling (e.g. generative models). A related question is how to define distance or similarity measures for discrete data sets. The purpose of the course is to review various topics in the analysis of discrete data within this framework.
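As a small illustration of the information-theoretic viewpoint, the following sketch estimates the mutual information between two discrete variables from a table of co-occurrence counts. It is not taken from the course material: the function name and the toy count tables are hypothetical, and the plug-in estimate used here is only one simple choice.

```python
import math

def mutual_information(counts):
    """Plug-in estimate of mutual information (in bits).

    counts[i][j] is the number of times value i of X was observed
    together with value j of Y (a co-occurrence table).
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n == 0:
                continue  # a zero cell contributes nothing (0 * log 0 := 0)
            p_xy = n / total
            # ratio p(x, y) / (p(x) * p(y)), written in terms of counts
            ratio = n * total / (row_sums[i] * col_sums[j])
            mi += p_xy * math.log2(ratio)
    return mi

# Hypothetical toy tables, e.g. rows = documents, columns = words.
independent = [[10, 10], [10, 10]]  # X tells nothing about Y
dependent = [[20, 0], [0, 20]]      # X determines Y completely

print(mutual_information(independent))  # 0.0
print(mutual_information(dependent))    # 1.0
```

A value of zero means the rows and columns co-occur as if they were independent; larger values indicate more structure in the co-occurrences.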

A preliminary sketch of topics:

  • "Oldies"
  • Information Bottleneck
  • Discriminative Clustering
  • Kernel methods
  • ICA motivated
  • LDA/mPCA
  • Discretizing data

Presentation instructions

A whole session is reserved for each presentation. This means that a presentation should be roughly 45 minutes long, with the remaining time used for discussion and possible organizational issues. It is recommended that the slides of the presentation be given as handouts at the beginning of the session. Please also send the slides to the course assistant in PostScript or PDF format so that they can be published on the web.

In addition to the presentation, you have to prepare a summary article on the topic. The summaries are published on the web page, and they are supposed to be quite short (a few pages) and easy to read. The purpose is that other people can understand the basic idea and the most important aspects of the presented topic without having to read the original article(s) thoroughly. Clever illustrations and examples are highly encouraged.

The summaries (in PostScript or PDF format) are to be submitted to the course assistant (Arto Klami) within 12 days of the presentation. Because the deadline is after the actual presentation, you can (and should) take the comments given at the presentation into account.

You are requested (but not required) to release your summary article under a Creative Commons License of your choosing. You can do this, for example, by including the following text in your article:

This work is licensed under the Creative Commons Attribution-NonCommercial License. To view a copy of this license, visit or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
Friday, 17-Dec-2004 14:46:17 EET