Instructor | D.Sc.(Tech.) Ville Könönen |
---|---|
Course format | Seminar course |
Credits (ECTS) | 6 |
Semester | Spring 2006 (during periods III and IV) |
Seminar sessions | Tuesdays 14-16 in lecture hall T4 of the computer science building (Konemiehentie 2, Otaniemi, Espoo) |
Language | English, if at least one participant is English-speaking |
Web | http://www.cis.hut.fi/Opinnot/T-61.6020/ |
E-mail | t616020@cis.hut.fi |
Reinforcement learning has attracted a lot of attention in recent years. Although reinforcement learning methods were earlier considered too ambitious and lacking a firm foundation, they are now established as practical methods for solving tasks that require decision making and planning. The key concept in the reinforcement learning model is the agent, which learns the statistical properties of its environment by interacting with it in a trial-and-error manner. In this seminar course, the focus is on the principal reinforcement learning models and the mathematical theory behind them. Some extensions, for example systems with multiple simultaneous learners, are also studied, along with relevant state-of-the-art applications of reinforcement learning.
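
As an illustration of the trial-and-error interaction described above, the following is a minimal sketch of tabular Q-learning. The corridor environment, the function names, and all parameter values are assumptions made for this example only; they are not taken from the course material.

```python
import random

# Hypothetical toy environment (not from the course material): a corridor of
# N_STATES cells. The agent starts in cell 0 and receives reward 1 on reaching
# the rightmost cell, which ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Return (next_state, reward, done) for one interaction with the environment."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values Q[state][action_index] by epsilon-greedy interaction."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or when the estimates are tied),
            # otherwise exploit the current estimates.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward the one-step target.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

The agent is never told the transition rule in `step`; it only observes states and rewards, which is the sense in which it learns from interaction.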
The course consists of seminar sessions and a small project work. The course is graded pass/fail. Passing requires active participation in the seminars (at least 70%) and an accepted project work.
The course material consists of two main texts and several research articles:
Short descriptions of the material can be found in the following table:
Material | A short description |
---|---|
Chapter 3 in [1] | Basic definitions and ideas behind modern Reinforcement Learning (RL) methods. |
Chapter 4 in [1] | Describes the connection between RL and a certain dynamic programming task. |
Chapter 5 in [1] | Monte Carlo methods for solving Markov Decision Processes. |
Chapter 6 in [1] | Temporal Difference methods for solving Markov Decision Processes. |
Sections 4.1, 4.3, and 5.6 in [2] | Stochastic gradient methods and their convergence, with Q-learning as an example. The material is quite theoretical, but going through all the details is not necessary; the goal is to understand the connection between the theory and RL methods. |
[3] | A tutorial on Partially Observable Markov Decision Processes (POMDPs). |
[4] | An example system using POMDPs. |
Chapter 3 in [5] | A basic introduction to Game Theory. |
[6] | The first application of Markov Games to multiagent RL. |
[7] | A generalization of [6] to general-sum problems. |
[8] | Sequential equilibrium approach to general-sum learning problems. |
[9] | Elevator control application. |
[10] | An RL approach to the game of Go. |
[11] | The chess player NeuroChess. |
[12] | The backgammon player TD-Gammon. |
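
To make the dynamic programming connection mentioned for Chapter 4 of [1] concrete, here is a minimal sketch of value iteration on a small Markov Decision Process. The three-state MDP, its probabilities and rewards, and the function name are hypothetical and chosen only for illustration; the sketch is not taken from [1].

```python
# Hypothetical 3-state MDP used only for illustration.
# TRANSITIONS[state][action] = list of (probability, next_state, reward)
TRANSITIONS = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(0.8, 2, 1.0), (0.2, 1, 0.0)]},
    2: {},  # terminal state: no actions, value stays at 0
}


def value_iteration(transitions, gamma=0.9, tol=1e-10):
    """Repeat Bellman optimality backups until the values stop changing."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:
                continue  # terminal state
            # Best expected one-step return over the available actions.
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place update of the value table
        if delta < tol:
            return values


if __name__ == "__main__":
    print(value_iteration(TRANSITIONS))
```

Unlike a learning agent, this procedure assumes the transition probabilities and rewards are known in advance; the Monte Carlo and Temporal Difference methods in the table estimate the same kind of values from sampled experience instead.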
The course has seven seminar sessions. The detailed timetable is as follows:
Date | Material | Presenter |
---|---|---|
24.1. | Introduction Lecture | Ville Könönen |
31.1. | Chapters 3 and 4 in [1] | Jaakko Väyrynen & Vibhor Kumar |
7.2. | Chapters 5 and 6 in [1] | Ville Viitaniemi & Paul Wagner |
14.2. | Sections 4.1, 4.3, and 5.6 in [2] | Elif Özge Özdamar |
21.2. | [3] and [4] | Jarkko Salojärvi & Kaius Perttilä |
28.2. | Chapter 3 in [5] and [6] | Joni Pajarinen & Yongnan Ji |
14.3. | [7] and [10] | Lauri Lyly & Chen Shanzhen |
Each participant should carry out one small project work. There are three possibilities, which can be found at the following links:
[project 1] [project 2] [project 3]
To pass the project work, a written report (in English) is required that describes the project, for example the tools used. In addition, each project description contains several questions. The final report should be e-mailed to the instructor. The deadline for the project work is 4.4.2006.
The seminar is mainly intended for graduate students. However, advanced undergraduate students with a reasonable knowledge of statistical machine learning methods may also participate.
Welcome!
Page maintained by t616020@cis.hut.fi, last updated Tuesday, 14-Mar-2006 18:32:05 EET