Tik-61.261 Principles of Neural Computing
- In Section 4.6 (part 5; Haykin, p. 181) it is mentioned that the
inputs should be normalized to accelerate the convergence of the
back-propagation learning process by preprocessing them as follows: 1)
their mean should be close to zero, 2) the input variables should be
uncorrelated, and 3) the covariances of the decorrelated inputs should
be approximately equal.
- Devise a method based
on principal component analysis performing these steps.
- Is the
proposed method unique?
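
One possible way to organize the three steps is sketched below. It is a sketch under stated assumptions, not the only valid answer: the symbols $\boldsymbol{\mu}$, $\mathbf{C}$, $\mathbf{U}$, $\boldsymbol{\Lambda}$ and $\mathbf{Q}$ are introduced here and do not appear in the exercise text, and $\mathbf{C}$ is assumed to be nonsingular.

```latex
\begin{align*}
\boldsymbol{\mu} &= E[\mathbf{x}], \qquad
\mathbf{C} = E\!\left[(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^T\right]
  = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T, \\
\mathbf{y} &= \boldsymbol{\Lambda}^{-1/2}\mathbf{U}^T(\mathbf{x}-\boldsymbol{\mu}), \\
E[\mathbf{y}] &= \mathbf{0}, \qquad
E[\mathbf{y}\mathbf{y}^T]
  = \boldsymbol{\Lambda}^{-1/2}\mathbf{U}^T\mathbf{C}\,\mathbf{U}\boldsymbol{\Lambda}^{-1/2}
  = \mathbf{I}, \\
\mathbf{z} &= \mathbf{Q}\mathbf{y},\quad \mathbf{Q}\mathbf{Q}^T=\mathbf{I}
  \;\Longrightarrow\; E[\mathbf{z}\mathbf{z}^T] = \mathbf{Q}\,\mathbf{I}\,\mathbf{Q}^T = \mathbf{I}.
\end{align*}
```

Here $\mathbf{U}$ holds the eigenvectors of $\mathbf{C}$ (the principal directions) and $\boldsymbol{\Lambda}$ the eigenvalues. Subtracting $\boldsymbol{\mu}$ handles step 1, projecting onto $\mathbf{U}$ handles step 2, and scaling by $\boldsymbol{\Lambda}^{-1/2}$ handles step 3. The last line suggests why uniqueness is in question: any further orthogonal rotation $\mathbf{Q}$ leaves all three properties intact.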
- A continuous function $h(x)$ can be approximated with
a step function in the closed interval $x \in [a, b]$, as
illustrated in Figure 1.
- Show how a single
column, that is of height $h(x_i)$ in the interval
$x \in (x_i - \Delta x/2,\; x_i + \Delta x/2)$
and zero elsewhere, can be constructed with a two-layer
MLP. Use two hidden units and the sign function as the activation
function. The activation function of the output unit is taken to be
linear.
- Design a two-layer MLP consisting of such simple
sub-networks which approximates the function $h(x)$ with a precision
determined by the width and the number of the columns.
- How does the
approximation change if tanh is used instead of sign as an activation
function in the hidden layer?
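
The column construction can be explored numerically with a short sketch such as the one below. It is only one possible construction, and every name in it (`sign`, `column`, `step_mlp`, `max_error`, the `act` parameter) is hypothetical. It assumes a linear output unit and the convention sign(0) = 0.

```python
import math

def sign(v):
    """Signum: +1 for v > 0, -1 for v < 0, 0 at v == 0 (assumed convention)."""
    return (v > 0) - (v < 0)

def column(x, center, width, height, act=sign):
    """One column built from two hidden units and a linear output unit."""
    left = act(x - (center - width / 2))    # hidden unit 1: threshold at the left edge
    right = act(x - (center + width / 2))   # hidden unit 2: threshold at the right edge
    return 0.5 * height * (left - right)    # equals height inside, 0 outside (for sign)

def step_mlp(x, h, a, b, n_columns, act=sign):
    """Sum of n_columns columns, i.e. 2 * n_columns hidden units in total."""
    width = (b - a) / n_columns
    centers = [a + (k + 0.5) * width for k in range(n_columns)]
    return sum(column(x, c, width, h(c), act) for c in centers)

def max_error(h, a, b, n_columns, n_probe=1000):
    """Largest |h(x) - step_mlp(x)| over evenly spaced probe points."""
    xs = [a + (k + 0.5) * (b - a) / n_probe for k in range(n_probe)]
    return max(abs(h(x) - step_mlp(x, h, a, b, n_columns)) for x in xs)
```

Calling `max_error(math.sin, 0.0, math.pi, 10)` and then again with 100 columns shows the error shrinking as the columns get narrower. Passing `act=math.tanh` to `column` replaces the sharp edges with smooth ones, so a narrow column no longer reaches its full height unless the hidden-unit inputs are scaled by a large gain.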
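- An MLP is used for a classification task. The number of classes
is $C$ and the classes are denoted with
$\omega_1, \ldots, \omega_C$. Both the input vector $\mathbf{x}$
and the corresponding
class are random variables, and they are assumed to have a joint probability distribution
$p(\mathbf{x}, \omega)$. Assume that we have so many training
samples that the back-propagation algorithm minimizes the following
expectation value:
$$E\left[ \sum_{i=1}^{C} \left( y_i(\mathbf{x}) - t_i \right)^2 \right],$$
where $y_i(\mathbf{x})$ is the actual response of the $i$th output
neuron and $t_i$ is the desired response.
- Show that the theoretical solution of the minimization problem is
$$y_i(\mathbf{x}) = E[t_i \mid \mathbf{x}].$$
- Show that if $t_i = 1$ when $\mathbf{x}$
belongs to class $\omega_i$ and $t_i = 0$
otherwise, the theoretical solution can be written
$$y_i(\mathbf{x}) = P(\omega_i \mid \mathbf{x}),$$
which is the optimal solution in a Bayesian sense.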
- Sometimes the number of the output neurons is chosen to be less than the
number of classes. The classes can be then coded with a binary code. For
example in the case of 8 classes and 3 output neurons, the desired
output for class $\omega_1$ is $[0, 0, 0]^T$, for class $\omega_2$
it is $[0, 0, 1]^T$, and so on. What is the theoretical solution
in such a case?
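
A small numerical check of the conditional-mean idea is sketched below. It is an illustration under assumptions, not a derivation. The joint probabilities are made up, the helper names are hypothetical, and for brevity it uses 4 classes with 2 output neurons rather than the 8 classes and 3 neurons of the example.

```python
def posterior(joint_row):
    """P(class | x) from one row of an assumed joint table p(x, class)."""
    total = sum(joint_row)
    return [p / total for p in joint_row]

def optimal_outputs(post, targets):
    """Conditional mean E[t | x]; targets[k] is the target vector of class k."""
    n_out = len(targets[0])
    return [sum(post[k] * targets[k][j] for k in range(len(post)))
            for j in range(n_out)]

def expected_sq_error(y, post, targets):
    """E[ sum_j (y_j - t_j)^2 | x ] for a candidate output vector y."""
    return sum(post[k] * sum((yj - tj) ** 2 for yj, tj in zip(y, targets[k]))
               for k in range(len(post)))

# Hypothetical joint probabilities p(x, class) for one fixed x and 4 classes.
joint_row = [0.125, 0.0625, 0.25, 0.0625]
post = posterior(joint_row)

# One output neuron per class (1-of-C coding).
one_hot = [[1 if i == k else 0 for i in range(4)] for k in range(4)]
# Two output neurons, classes coded as 00, 01, 10, 11.
binary = [[(k >> 1) & 1, k & 1] for k in range(4)]

y_one_hot = optimal_outputs(post, one_hot)  # one value per class
y_binary = optimal_outputs(post, binary)    # one value per code bit
```

With 1-of-C coding the conditional mean coincides with the class posteriors. With the binary code each output is the sum of the posteriors of the classes whose code has a 1 in that bit position. Evaluating `expected_sq_error` at `y_binary` and at a slightly perturbed vector shows the perturbed vector doing worse.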
Figure 1: Function approximation with a step function.