Tik-61.261 Principles of Neural Computing
Raivio, Venna

Exercise 5
  1. The McCulloch-Pitts perceptron can be used to perform numerous logical tasks. The neuron is assumed to have two binary input signals, $ x_1$ and $ x_2$, and a constant bias signal, which are combined into an input vector as follows: $ \mathbf{x}=[x_1,x_2,-1]^T$, $ x_1,x_2\in \{0,1\}$. The output of the neuron is given by

    $\displaystyle y=\begin{cases}1 \text{, \, if } \mathbf{w}^T\mathbf{x}>0 \\ 0 \text{, \, if } \mathbf{w}^T\mathbf{x} \leq 0 \\ \end{cases}$    

    where $ \mathbf{w}$ is an adjustable weight vector. Demonstrate the implementation of the following binary logic functions with a single neuron:
    1. $ A$
    2. not $ B$
    3. $ A$ or $ B$
    4. $ A$ and $ B$
    5. $ A$ nor $ B$
    6. $ A$ nand $ B$
    7. $ A$ xor $ B$.
    What is the value of the weight vector in each case?
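
    One way to verify candidate weights is the small numerical check sketched below in Python. The weight vectors listed are only one valid choice, not the unique answer, and cover cases 1-6 only:

      import numpy as np

      # Candidate weight vectors w = [w1, w2, w_bias]; the input is x = [x1, x2, -1].
      # These particular values are just one possible choice.
      gates = {
          "A":        ([ 1.0,  0.0,  0.5], lambda a, b: a),
          "not B":    ([ 0.0, -1.0, -0.5], lambda a, b: 1 - b),
          "A or B":   ([ 1.0,  1.0,  0.5], lambda a, b: a | b),
          "A and B":  ([ 1.0,  1.0,  1.5], lambda a, b: a & b),
          "A nor B":  ([-1.0, -1.0, -0.5], lambda a, b: 1 - (a | b)),
          "A nand B": ([-1.0, -1.0, -1.5], lambda a, b: 1 - (a & b)),
      }

      def neuron(w, x1, x2):
          # McCulloch-Pitts output: 1 if w^T x > 0, else 0, with x = [x1, x2, -1].
          return 1 if np.dot(w, [x1, x2, -1.0]) > 0 else 0

      for name, (w, target) in gates.items():
          ok = all(neuron(w, a, b) == target(a, b) for a in (0, 1) for b in (0, 1))
          print(f"{name:10s} w = {w}  correct: {ok}")

    The xor function (case 7) is not linearly separable, so no single weight vector of this form can implement it.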


  2. A single perceptron is used for a classification task, and its weight vector $ \mathbf{w}$ is updated iteratively in the following way:

    $\displaystyle \mathbf{w}(n+1)=\mathbf{w}(n) + \alpha(y-y')\mathbf{x}$    

    where $ \mathbf{x}$ is the input signal, $ y'=\mathrm{sgn}(\mathbf{w}^T\mathbf{x})=\pm 1$ is the output of the neuron, and $ y=\pm 1$ is the correct class. The parameter $ \alpha$ is a positive learning rate. How does the weight vector $ \mathbf{w}$ evolve from its initial value $ \mathbf{w}(0)=[1,1]^T$ when the above updating rule is applied with $ \alpha=0.4$ and we have the following samples from classes $ {\cal{C}}_1$ and $ {\cal{C}}_2$:

      $\displaystyle {\cal{C}}_1:$     $\displaystyle \{[2,1]^T\},$    
      $\displaystyle {\cal{C}}_2:$     $\displaystyle \{[0,1]^T, [-1,1]^T \}$    
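
    A minimal simulation of this rule is sketched below in Python, assuming targets $ y=+1$ for $ {\cal{C}}_1$ and $ y=-1$ for $ {\cal{C}}_2$, cyclic presentation of the samples in the order listed, and sgn(0) taken as $ +1$:

      import numpy as np

      alpha = 0.4
      samples = [
          (np.array([ 2.0, 1.0]), +1),   # class C1
          (np.array([ 0.0, 1.0]), -1),   # class C2
          (np.array([-1.0, 1.0]), -1),   # class C2
      ]

      w = np.array([1.0, 1.0])           # initial weight vector w(0)
      for epoch in range(10):            # a few passes suffice for this data
          changed = False
          for x, y in samples:
              y_pred = 1 if np.dot(w, x) >= 0 else -1
              w = w + alpha * (y - y_pred) * x    # no change when y == y_pred
              changed |= (y != y_pred)
              print(f"x = {x}, y = {y:+d}, y' = {y_pred:+d}, w -> {w}")
          if not changed:                # stop once a full pass makes no update
              break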


  3. Suppose that in the signal-flow graph of the perceptron illustrated in Figure 1 the hard limiter is replaced by the sigmoidal nonlinearity:

    $\displaystyle \varphi(v)=\tanh(\frac{v}{2})$    

    where $ v$ is the induced local field. The classification decisions made by the perceptron are defined as follows:

    The observation vector $ \mathbf{x}$ belongs to class $ {\cal{C}}_1$ if the output $ y>\theta$, where $ \theta$ is a threshold;
    otherwise, $ \mathbf{x}$ belongs to class $ {\cal{C}}_2$.

    Show that the decision boundary so constructed is a hyperplane.
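
    One possible route is sketched below, assuming $ -1<\theta<1$ and writing $ b$ for the bias of the perceptron, so that the induced local field is $ v=\mathbf{w}^T\mathbf{x}+b$. Since $ \tanh$ is strictly increasing,

    $\displaystyle y>\theta \;\Leftrightarrow\; \tanh(v/2)>\theta \;\Leftrightarrow\; v>2\tanh^{-1}(\theta) \;\Leftrightarrow\; \mathbf{w}^T\mathbf{x}+b>2\tanh^{-1}(\theta),$    

    so the decision boundary $ \mathbf{w}^T\mathbf{x}+b=2\tanh^{-1}(\theta)$ is a linear equation in $ \mathbf{x}$, i.e. a hyperplane.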


  4. Two pattern classes, $ {\cal{C}}_1$ and $ {\cal{C}}_2$, are assumed to have Gaussian distributions which are centered at the points $ \mu_1=[-2,-2]^T$ and $ \mu_2=[2,2]^T$ and have the following covariance matrices:

    $\displaystyle \Sigma_1=\begin{bmatrix}\alpha & 0 \\ 0 & 1\end{bmatrix}$     and  $\displaystyle \Sigma_2=\begin{bmatrix}3 & 0 \\ 0 & 1\end{bmatrix}.$    

    Plot the distributions and determine the optimal Bayesian decision surface for $ \alpha=3$ and $ \alpha=1$. In both cases, assume that the prior probabilities of the classes are equal, the costs associated with correct classifications are zero, and the costs associated with misclassifications are equal.
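
    The surfaces can also be checked numerically with the sketch below (a Python illustration using scipy and matplotlib; the grid range and contour levels are arbitrary choices, not part of the exercise statement):

      import numpy as np
      import matplotlib.pyplot as plt
      from scipy.stats import multivariate_normal

      # Plot both class densities and the zero level set of the log-likelihood
      # ratio, which is the Bayes decision surface under equal priors and
      # equal misclassification costs.
      mu1, mu2 = np.array([-2.0, -2.0]), np.array([2.0, 2.0])
      xx, yy = np.meshgrid(np.linspace(-6, 6, 300), np.linspace(-6, 6, 300))
      grid = np.dstack([xx, yy])

      fig, axes = plt.subplots(1, 2, figsize=(10, 5))
      for ax, alpha in zip(axes, (3.0, 1.0)):
          S1 = np.array([[alpha, 0.0], [0.0, 1.0]])
          S2 = np.array([[3.0,   0.0], [0.0, 1.0]])
          p1 = multivariate_normal(mu1, S1).pdf(grid)
          p2 = multivariate_normal(mu2, S2).pdf(grid)
          ax.contour(xx, yy, p1, levels=5, colors="tab:blue")
          ax.contour(xx, yy, p2, levels=5, colors="tab:red")
          # Decision surface: log p(x|C1) - log p(x|C2) = 0
          ax.contour(xx, yy, np.log(p1) - np.log(p2), levels=[0.0],
                     colors="k", linewidths=2)
          ax.set_title(f"alpha = {alpha:g}")
      plt.show()

    Note that for $ \alpha=3$ the two covariance matrices are identical, so the decision surface is a straight line, whereas for $ \alpha=1$ the quadratic terms no longer cancel and the surface is a curve.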

Figure 1: The signal-flow graph of the perceptron. [image: ex5_3.eps]





Jarkko Venna 2005-04-13