Laboratory of Computer and Information Science / Neural Networks Research Centre (CIS Lab), Helsinki University of Technology

Courses in previous years: [ 2004 ]

Replaces the course:
T-61.186 Special course in Language Technology V P 2-5 cr
T-61.186 Kieliteknologian erikoiskurssi V L 2-5 ov

T-61.6090 Special course in Language Technology V P
T-61.6090 Kieliteknologian erikoiskurssi V L

Statistical Machine Translation, Autumn 2005

Organizer: Timo Honkela
Assistants: Tiina Lindh-Knuutila, Jaakko Väyrynen
Researchers: Krista Lagus, Ann Russell
Credit points: 7
Semester: Autumn 2005 and to some extent spring 2006 (during periods I, II and III)
Place: virtual discussions in the Fle3 environment; actual meetings at the computer science building, Konemiehentie 2, Otaniemi, Espoo
Time: weekly discussions with new topics in the Fle3 environment
Language: English

Machine Translation (MT) refers to automatic translation from one natural language to another. To date, even state-of-the-art systems cannot provide high-quality translations except in some limited domains, such as weather reports. Moreover, building a traditional MT system based on rules, transformations, or some kind of knowledge representation formalism requires substantial human effort.

The availability of monolingual and parallel corpora enables the use of statistical methods for machine translation. A parallel corpus contains the same text in different languages. Parallel corpora are interesting because they make it possible to align the original and the translated text and thus gain insights into the nature of translation. However, useful information for MT can also be gained from statistical analysis of monolingual corpora.

Significant improvements to translation system performance may be achieved using statistical approaches. Potential benefits of the statistical approaches include:

The following list outlines some topics that will or may be covered during the seminar. Some topics will be covered in more detail and some other themes will emerge during the seminar.

The seminar is organized as a knowledge-building, or progressive inquiry, learning activity. To facilitate this process, and to make it possible for students from universities outside the Helsinki area to attend, the course is given as a virtual course in a web-based learning environment (Fle3) on the Internet. The course is organized as a workshop: the participants form groups and take part in discussions over the Internet. Each group works on its selected topic, presents its work orally in a final session, and writes a scientific article.

There will be two meetings where physical attendance is required. At the beginning of the course, there will be an introduction lecture and a guest lecture on the theme by Prof. Lauri Carlson, together with information on the basic arrangements and on how to use the learning environment. At the end of the course, intermediate project work is presented orally in a full-day seminar.

Important dates:

12 September 2005, 14-16, room T4: Introduction lecture + guest lecture by Prof. Lauri Carlson
15 December 2005: Deadline for draft papers
23 January 2006: Oral presentations + guest lecture
28 February 2006: Deadline for final papers

[Schedule chart: weeks 36-52 of 2005 and weeks 1-10 of 2006 (September to March, periods I-III), with rows for actual meetings, virtual meetings, introduction lecture, group formation, research plan, group research, draft article deadline, oral presentations, guest lecture, and article deadline.]

This special course is intended mainly for graduate students; however, advanced undergraduate students may also be admitted (please contact the course organizers). Each participant is expected to have a reasonably advanced understanding of some discipline neighboring the domain of interest (machine translation, statistical methods, cognitive science, language philosophy).

Students, e.g., from the following disciplines and areas are welcome:

Successful participation does not require specific mathematical modeling or programming skills. However, basic familiarity with computational and/or mathematical modeling is useful.

The course will be graded as accepted / failed. Requirements for passing the course:

HUT students: In Webtopi
Other universities: Send email to the organizers

Or sign up at the introduction lecture.

Some possibly relevant literature can be found in this .bib file. A collection of links on Machine Translation is available on a separate link page.

For more information, please send email to

Timo Honkela, Tiina Lindh-Knuutila, Jaakko Väyrynen

