HELSINKI UNIVERSITY OF TECHNOLOGY
	LABORATORY OF COMPUTER AND INFORMATION SCIENCE
	NEURAL NETWORKS RESEARCH CENTRE

Courses in previous years: [ 1998 | 2000 | 2002 | 2003 | 2004 ]

T-61.183 Special Course in Information Science III P V
T-61.183 Informaatiotekniikan erikoiskurssi III L V

Lecturers:	Prof. Timo Honkela, Dr.Tech. Mikko Kurimo, Dr.Tech. Jorma Laaksonen, Dr.Tech. Krista Lagus, PhD Kai Puolamäki
Semester:	Spring 2005
Credit points:	3-4
Place:	Lecture hall T4 in the computer science building
Time:	Mondays at 14-16 o'clock (24.1.2005-2.5.2005)
Language:	English
Homepage:	http://www.cis.hut.fi/Opinnot/T-61.183/

Multimodal Systems - Course description

Moving beyond merely recognizing images, text, and speech toward automatically understanding their content makes many new applications possible. By bringing in data from multiple modalities and contexts that provide the required internal conceptual dimensions, one may be able to extract the correct perceptual information automatically. Multimodal interfaces can be developed by fusing several perceptual and user feedback modalities.

During the seminar, we will consider the following topics:

information retrieval from multimodal data
video content analysis
the MPEG-7 multimodal content description standard
analysis of multimodal data including: speech, images, video, eye movements and gestures
grounding word meanings in multimodal contexts
multimodal data segmentation
multimodal speech recognition
multimodal person recognition / identification
multimodal interfaces

Potential material to be covered during the course includes:

Ying Li and C.-C. Jay Kuo (2003). Video Content Analysis Using Multimodal Information. Kluwer.
Lynne Dunckley (2003). Multimedia Databases. Addison-Wesley.
Chen Yu, Dana H. Ballard and Richard N. Aslin (2003). The Role of Embodied Intention in Early Lexical Acquisition. Proceedings of the Twenty-Fifth Annual Meeting of the Cognitive Science Society. Boston, MA.
Deb Roy (2004). Grounding Language in the World: Schema Theory Meets Semiotics.
Deb Roy (2003). Grounded Spoken Language Acquisition: Experiments in Word Learning. IEEE Transactions on Multimedia.
Proceedings of the IEEE Vol. 91, Issue 9, Sept. 2003. Special issue on human-computer multimodal interface.
(http://ieeexplore.ieee.org/xpl/tocresult.jsp?isNumber=27570&puNumber=5)
Contents:
- Interacting with computers by voice: automatic speech recognition and synthesis
- Recent advances in the automatic recognition of audiovisual speech
- Speech-gesture driven multimodal interfaces for crisis management
- Boosted learning in dynamic Bayesian networks for multimodal speaker detection
- Toward an affect-sensitive multimodal human-computer interaction
- Perceptive animated interfaces: first steps toward a new paradigm for human-computer interaction

To pass the course, you will need to:

participate sufficiently in the seminar meetings,
give a talk,
solve a sufficient percentage of problems, and
perform given empirical and/or experimental assignment(s).

Emerging timetable

24 Jan 05:
- Introduction to multimodal systems. How human beings process multimodal information. (Honkela)
- Introducing the participants. Practical arrangements.
31 Jan 05:
- Different aspects of multimodal systems
  - Laaksonen
  - Kurimo (slides)
  - Lagus
  - Puolamäki (slides and potential assignment topics)
- Assigning papers.
7 Feb 05: no common session (potentially good time for the groups to meet!)
14 Feb 05:
- Group 3, Jaakko Väyrynen and Tiina Lindh-Knuutila: "Grounded Spoken Language Acquisition" (based on an article by Deb Roy) and Quan Zhon: "Grounding Language in the World: Signs, Schemas, and Meaning" (based on an article by Deb Roy)
21 Feb 05:
- Group 2 (Pöllä et al.): "Automatic Annotation of Images"
28 Feb 05:
- Group 1 (Pylkkönen et al.): "Video Analysis"
14 Mar 05:
- Group 3 (Lindh-Knuutila and Väyrynen): "Affect-sensitive multimodal human-computer interaction" (slides)
21 Mar 05:
- Group 1 (Kivinen et al.): "MPEG-7 multimedia content description standard"
- giving the homework problems
4 Apr 05:
- Group 2 (Yang et al.): topic to be agreed
11 Apr 05:
- returning the homework problems
- presenting model answers to the homework
18 Apr 05:
- presenting model answers to the homework
