The main goal of this work is to implement a widely used classification method and test it on fairly simple data. The work is carried out in Matlab, and the data consist of images of fruits. Finding good features, parameter values, learning set sizes, and so on is rather time consuming, so it is recommended that you run the computations at off-peak hours, for example at night. To do that, use the Unix command at:
>>at 02:15  
use -q matlab; /bin/nice -n10 matlab < exp.m > out.txt
With the command above, Matlab starts at 02:15 and performs the tasks defined in the file exp.m. The standard output of the Matlab run is redirected to the file out.txt.


As the main purpose of the work is not to learn how to use Matlab but to get more familiar with pattern recognition methods, some of the Matlab code is provided. Your task is to code the recognizer.


The profile images of the fruits are stored as matrices in the Matlab file /p/edu/tik-61.231/Data/fruits.mat. The images vary in size and shape, and the fruits are pictured at different angles and in different positions. You can have a look at some of the fruit images here.

Matlab functions

You can perform the feature extraction with the following functions: cent2, pca2, and plen2. The first function calculates the center of mass of a two-dimensional image. The second function returns the principal components of the image. The third function calculates the lengths from a given point to the boundaries of the image. The first length is calculated in the direction of the first basis vector; the basis is given as an argument of the function. The direction changes by a constant amount between successive lengths, and this constant is also given as an argument.

You can get the functions from here: cent2.m, pca2.m, and plen2.m.
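
As a rough illustration of the quantities involved, the following sketch computes a center of mass and principal axes directly from a small binary image. It does not call cent2 or pca2, whose exact signatures are not given here; the image img and all variable names are hypothetical, and the sketch assumes that object pixels are nonzero.

```matlab
% Hypothetical 7x9 binary image: 1 = object pixel, 0 = background.
img = zeros(7, 9);
img(3:5, 2:8) = 1;                 % a 3x7 horizontal bar

% Row (y) and column (x) coordinates of the object pixels.
[r, c] = find(img);

% Center of mass: mean of the object pixel coordinates, as [x, y].
center = [mean(c), mean(r)];

% Principal axes: eigenvectors of the covariance of the centered coordinates.
X = [c - center(1), r - center(2)];
C = (X' * X) / size(X, 1);
[V, D] = eig(C);
[~, order] = sort(diag(D), 'descend');
V = V(:, order);                   % first column = direction of largest spread

disp(center);
disp(V);
```

For this bar the center is at x = 5, y = 4, and the first principal axis lies along the x direction (up to sign), since the bar is wider than it is tall.
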


Feature selection

You can use the lengths from an inside point to the boundaries of the image as features.
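
One possible way to measure such lengths is to step along a ray, one pixel at a time, until the ray leaves the object. The sketch below is only an illustration under that assumption and is not the provided plen2 function; the image, the start point, the number of directions nDirs, and the start angle theta0 are all hypothetical.

```matlab
% Hypothetical 11x11 binary image containing a 5x7 rectangle.
img = zeros(11, 11);
img(4:8, 3:9) = 1;
center = [6, 6];                   % [x, y], assumed to lie inside the object
nDirs = 4;                         % number of directions (assumption)
theta0 = 0;                        % angle of the first direction
dtheta = 2 * pi / nDirs;           % constant change of direction

lengths = zeros(1, nDirs);
for k = 1:nDirs
    theta = theta0 + (k - 1) * dtheta;
    d = [cos(theta), sin(theta)];  % unit step along the ray
    t = 0;
    while true
        p = round(center + (t + 1) * d);   % next pixel along the ray
        inside = p(1) >= 1 && p(1) <= size(img, 2) && ...
                 p(2) >= 1 && p(2) <= size(img, 1) && ...
                 img(p(2), p(1)) == 1;
        if ~inside, break; end
        t = t + 1;
    end
    lengths(k) = t;                % number of object pixels passed on the ray
end
disp(lengths);
```

With real fruit images, a larger number of directions and a start point at the center of mass would be natural choices, but the right values have to be found by experiment.
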

Normalization methods

If you calculate the lengths from the center of mass of the image, you can avoid the effects of translation. To get features that are insensitive to rotation, calculate the lengths in the directions defined by the principal component axes of the image and sort them in increasing order. These features will also be invariant to size variations if the feature vectors are scaled so that the sum of the features is a constant. What is a sufficient number of features?
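
The sorting and scaling steps can be sketched as follows. The raw lengths are made-up values, and the constant sum is assumed to be 1; the second half checks that a scaled and cyclically shifted copy of the same lengths gives the same feature vector.

```matlab
% Hypothetical raw lengths measured from the center of mass.
lengths = [3 2 3 2 5 4];

% Sort in increasing order, then scale so that the features sum to 1.
features = sort(lengths);
features = features / sum(features);

% The same shape, 2.5 times larger and with the directions cyclically shifted.
other = 2.5 * lengths([3:end, 1:2]);
otherFeatures = sort(other);
otherFeatures = otherFeatures / sum(otherFeatures);

% Largest difference between the two feature vectors (zero up to rounding).
disp(max(abs(features - otherFeatures)));
```

Note that sorting also discards the order in which the lengths were measured, so two different shapes with the same set of lengths cannot be told apart by these features.
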

Recognition methods

Implement one of the following recognition methods: 1) the k Nearest Neighbors (kNN) classifier, 2) the Learning Vector Quantization (LVQ) classifier, 3) the Bayesian classifier, or 4) the Multi-layer perceptron (MLP). In the first two cases, use the Euclidean metric. Experiment with different values of k and different prototype set sizes. Use a linearly decaying learning coefficient for LVQ. In the case of the Bayesian classifier, assume that the probability distributions are Gaussian. In the case of the MLP, use functions from the Matlab Neural Network Toolbox, keep the size of the network reasonable, and start experimenting with the default settings.
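
To make the first option concrete, here is a minimal sketch of the kNN decision rule with the Euclidean metric on toy two-dimensional data. The data, the variable names (trainX, trainY, testX), and the value k = 3 are assumptions made for the illustration; it is not a complete recognizer for the fruit images.

```matlab
% Hypothetical training data: one feature vector per row, class index per sample.
trainX = [0.0 0.0; 0.1 0.2; 0.2 0.1; 1.0 1.0; 0.9 1.1; 1.1 0.9];
trainY = [1; 1; 1; 2; 2; 2];
testX  = [0.15 0.1; 0.95 1.0];
k = 3;

predicted = zeros(size(testX, 1), 1);
for i = 1:size(testX, 1)
    % Squared Euclidean distances from test sample i to all training samples.
    diffs = trainX - repmat(testX(i, :), size(trainX, 1), 1);
    dist2 = sum(diffs .^ 2, 2);
    [~, idx] = sort(dist2);
    % Majority vote among the k nearest training samples.
    predicted(i) = mode(trainY(idx(1:k)));
end
disp(predicted');
```

Squared distances give the same neighbor ordering as the distances themselves, so the square root is skipped. When the vote is tied, mode returns the smallest class index, which is one reason to prefer values of k that make ties unlikely.
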

Divide the samples into two sets, and use one set for training and the other for testing the recognition system.
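
A random split can be sketched as below. The number of samples and the 50/50 ratio are assumptions; the split ratio is one of the choices you need to justify.

```matlab
nSamples = 10;                     % hypothetical number of samples
trainFraction = 0.5;               % assumed share of samples used for training

% Random permutation of the sample indices, cut into two disjoint parts.
order = randperm(nSamples);
nTrain = round(trainFraction * nSamples);
trainIdx = order(1:nTrain);
testIdx = order(nTrain + 1:end);

disp(sort(trainIdx));
disp(sort(testIdx));
```

A plain random split does not guarantee that every class is equally represented in both sets; splitting each class separately is one way to ensure that.
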


In the report, give a brief description of the recognition method applied, justify all the choices you have made, and give the recognition results of the best recognizer (calculate the total error rate and the number of occurrences of each error type). Also include your commented Matlab code.
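
One common way to report both figures is a confusion matrix, in which entry (i, j) counts the test samples of class i that were recognized as class j. The sketch below uses made-up labels for three classes; trueY and predY are hypothetical names.

```matlab
% Hypothetical true and predicted class indices for nine test samples.
trueY = [1 1 1 2 2 2 3 3 3]';
predY = [1 1 2 2 2 2 3 1 3]';
nClasses = 3;

% confusion(i, j) = number of samples of class i recognized as class j.
confusion = zeros(nClasses);
for i = 1:numel(trueY)
    confusion(trueY(i), predY(i)) = confusion(trueY(i), predY(i)) + 1;
end

% Total error rate: share of samples that fall off the diagonal.
errorRate = 1 - trace(confusion) / numel(trueY);

disp(confusion);
disp(errorRate);
```

The off-diagonal entries are the occurrences of the individual error types. In this toy example, two of the nine samples are misclassified, giving an error rate of 2/9.
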
Wednesday, 18-Mar-2009 22:50:35 EET