- One important consideration in the design of an MLP network is its
ability to generalize. Generalization means that the network responds
well not only to the training samples but also to samples it has not
seen during training. Good generalization can be obtained only if the
number of free parameters of the network is kept reasonable. As a rule
of thumb, the number of training samples should be at least five times
the number of parameters. If there are fewer training samples than
parameters, the network easily overlearns (overfits): it handles the
training samples perfectly but gives arbitrary responses to all other
samples.
An MLP is used for a classification task in which the samples are divided into five classes. The input vectors of the network consist of ten features and the size of the training set is 800. According to the rule given above, how many hidden units can there be at most?
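
The counting behind this question can be sketched as follows. This is only one possible reading of the problem, not an official solution: it assumes a fully connected network with a single hidden layer and one output unit per class, and the function names and the `use_bias` switch are illustrative. Whether bias terms count as free parameters changes the answer, so both cases are shown.

```python
def mlp_parameter_count(n_inputs, n_hidden, n_outputs, use_bias=True):
    """Free parameters of a fully connected MLP with one hidden layer (assumed architecture)."""
    bias = 1 if use_bias else 0
    hidden_layer = (n_inputs + bias) * n_hidden    # input -> hidden weights
    output_layer = (n_hidden + bias) * n_outputs   # hidden -> output weights
    return hidden_layer + output_layer


def max_hidden_units(n_inputs, n_outputs, n_samples, ratio=5, use_bias=True):
    """Largest hidden-layer size with n_samples >= ratio * (number of parameters)."""
    h = 0
    while ratio * mlp_parameter_count(n_inputs, h + 1, n_outputs, use_bias) <= n_samples:
        h += 1
    return h


# Ten features, five classes (assumed: five output units), 800 training samples.
print(max_hidden_units(10, 5, 800))                  # biases counted: 16h + 5 <= 160
print(max_hidden_units(10, 5, 800, use_bias=False))  # biases ignored: 15h <= 160
```

With biases counted the bound is 9 hidden units; without them it is 10.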

- Consider the steepest descent method,
$\Delta \mathbf{w}(n) = -\eta\, \mathbf{g}(n)$, reproduced in formula
(4.120) and earlier in Chapter 3 (Haykin). How could you determine the
learning-rate parameter $\eta$ so that it minimizes the cost function
$\mathcal{E}(\mathbf{w})$ as much as possible?
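
One way to approach this question is an exact line search: at each step, choose the learning rate that minimizes the cost along the negative gradient direction. The sketch below is an illustration under a strong assumption, namely a quadratic cost E(w) = 0.5 wᵀAw − bᵀw with symmetric positive definite A, for which the minimizing rate has the closed form η = gᵀg / gᵀAg. The matrix `A`, vector `b`, and all function names are hypothetical and do not come from the exercise.

```python
def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def steepest_descent_exact(A, b, w, n_steps=50):
    """Steepest descent on the assumed quadratic cost with a per-step optimal learning rate."""
    for _ in range(n_steps):
        g = [gi - bi for gi, bi in zip(matvec(A, w), b)]  # gradient g = A w - b
        gAg = dot(g, matvec(A, g))
        if gAg == 0.0:                                    # gradient vanished: at the minimum
            break
        eta = dot(g, g) / gAg                             # minimizes E(w - eta * g) over eta
        w = [wi - eta * gi for wi, gi in zip(w, g)]
    return w


A = [[3.0, 1.0], [1.0, 2.0]]   # assumed symmetric positive definite
b = [1.0, 1.0]
print(steepest_descent_exact(A, b, [0.0, 0.0]))  # approaches the solution of A w = b
```

For a non-quadratic cost no such closed form exists in general, and the one-dimensional minimization over η would have to be done numerically.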

- Suppose that the interpolation problem described in Section 5.3
(Haykin) has more observation points than RBF basis functions. Derive
the best approximate solution to the interpolation problem in the
least-squares sense.
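
The standard least-squares argument for this overdetermined case can be outlined as below. The notation is assumed rather than taken from the exercise: N observations (x_j, d_j), M < N basis functions with centres t_i, weight vector w, and an N-by-M matrix Φ that is taken to have full column rank.

```latex
% Assumed notation: N observations (x_j, d_j), M < N centres t_i
\Phi \in \mathbb{R}^{N \times M}, \qquad
\varphi_{ji} = \varphi\bigl(\lVert \mathbf{x}_j - \mathbf{t}_i \rVert\bigr)

% Squared error to be minimized, since \Phi w = d has no exact solution in general
\mathcal{E}(\mathbf{w}) = \lVert \mathbf{d} - \Phi \mathbf{w} \rVert^{2}

% Setting the gradient to zero gives the normal equations
\nabla_{\mathbf{w}} \mathcal{E}
  = -2\, \Phi^{T} (\mathbf{d} - \Phi \mathbf{w}) = \mathbf{0}
\;\Longrightarrow\;
\Phi^{T} \Phi\, \mathbf{w} = \Phi^{T} \mathbf{d}

% Solution via the pseudoinverse (requires \Phi^T \Phi to be nonsingular)
\mathbf{w} = (\Phi^{T} \Phi)^{-1} \Phi^{T} \mathbf{d} = \Phi^{+} \mathbf{d}
```

When N = M and Φ is nonsingular, the pseudoinverse reduces to the ordinary inverse and the strict interpolation solution is recovered.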

Jarkko Venna 2005-04-13