Assignment: Independent component analysis (ICA)
You must return a written report in which you provide justified answers
to the questions in the assignment. Explain what you have done and why.
The complete, sufficiently commented source code should be included as
an appendix to the report. The report can be returned to the assistant
by email in PDF or PostScript format.
This computer assignment starts with getting acquainted in practice
with the popular Matlab software package FastICA. It has been developed
by A. Hyvärinen et al. at HUT, Lab. of Computer and Information Science,
for estimating the basic linear ICA and BSS model. The main goal of this
computer assignment is to apply FastICA to extract features from
image data consisting of natural scenes.
- Get the package of data and code from the Material section below,
extract it, start Matlab, move to the directory contained in the package and
run init to set up your environment.
- Experiment with FastICA using artificially generated data.
- Generate 200 data points from 2-4 independent sources, and
mix them linearly. Sources with temporal structure make the
results easier to visualize. You can try, for example, a sine
wave or a sawtooth wave, as well as white noise with different
distributions.
- Apply FastICA to the mixtures and inspect whether the original sources
are separated or not.
- Try sources with different distributions and observe how they
affect the performance of the algorithm.
- Try changing some of the parameters of the algorithm, such as
symmetric vs. deflation approach, different nonlinearities, etc.
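The synthetic experiment above can be sketched as follows. This is a minimal sketch, not part of the provided package: the frequencies, the uniform noise source and the variable names (S, A, X, Aest, West) are illustrative assumptions, and the sawtooth is built with mod so that no toolbox function is needed. The call to fastica uses the package defaults and is skipped if the package is not on the Matlab path.

```matlab
N = 200;                          % number of data points, as in the assignment
t = (0:N-1) / N;                  % normalized time axis

% One source per row; the frequencies are arbitrary illustrative choices.
S = [ sin(2*pi*5*t);              % sine wave
      2*mod(7*t, 1) - 1;          % sawtooth wave built without toolbox functions
      2*rand(1, N) - 1 ];         % uniformly distributed white noise

A = randn(3, 3);                  % random mixing matrix (assumed to be invertible)
X = A * S;                        % observed mixtures, one mixture per row

% Run FastICA with its default settings, but only if the package is on the path.
if exist('fastica', 'file')
    [icasig, Aest, West] = fastica(X);
    figure;
    for i = 1:size(icasig, 1)
        subplot(size(icasig, 1), 1, i);
        plot(icasig(i, :));       % one estimated source per panel
    end
end
```

ICA recovers the sources only up to permutation, sign and scaling, so compare the shapes of the rows of icasig with those of S rather than their order or amplitude.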
- ICA can also be used for feature extraction by seeking the most
independent features of a given data set (see Chapter 21 in the course book).
We experiment with this using digital images as the data. The following
outline shows what you should do. You can experiment with different
parameters such as the size of the patch, the number of patches, different
preprocessing methods and the parameters of FastICA, including the number
of independent components to extract.
- Load the set of natural images (images/nat*.tif) using the
provided function load_images.
- Sample patches from the images using sample_patches.
- Convert the patches to a set of vectors for use with FastICA.
You may view (some of) the patches using plot_columns.
- Apply different kinds of preprocessing to the data
(e.g. removing the local mean might be a good idea).
- Apply FastICA to the preprocessed data and visualize the
resulting basis images of the mixing matrix using
plot_columns.
- Study the recovered sources. What do their distributions look
like? Are the sources really independent?
- Convolve some of the features (columns of the mixing matrix) with one
of the full images. What do the features represent?
- Repeat the experiment using a set of images of buildings
(images/bui*.tif). Is there any difference in the
estimated features?
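Some of the steps in the image experiment can be sketched with stand-in data. This is only a sketch under stated assumptions: the patch layout (a p-by-p-by-n array), all variable names and the random stand-in arrays are hypothetical, since the outputs of load_images and sample_patches are not described here. In a real run the patches would come from sample_patches, and S and a would be the sources and one column of the mixing matrix returned by fastica.

```matlab
p = 16; n = 500;                        % hypothetical patch size and patch count
patches = rand(p, p, n);                % stand-in for patches sampled from the images

% Vectorize the patches: each column of X is one patch, so the rows are the
% observed signals (pixel positions) and the columns are the samples.
X = reshape(patches, p*p, n);

% Remove the local mean, i.e. the mean gray level of each individual patch.
X = X - repmat(mean(X, 1), p*p, 1);

% Excess kurtosis of each row of a source matrix: close to 0 for Gaussian
% data and positive for sparse (super-Gaussian) sources.
S = randn(4, n);                        % stand-in for the sources from fastica
Sc = S - repmat(mean(S, 2), 1, n);      % center each source
kurt = mean(Sc.^4, 2) ./ mean(Sc.^2, 2).^2 - 3;

% Filter a full image with one feature reshaped back into a p-by-p patch.
% conv2 flips the kernel; use rot90(kernel, 2) to get correlation instead.
img = rand(64, 64);                     % stand-in for one of the loaded images
a = randn(p*p, 1);                      % stand-in for a column of the mixing matrix
kernel = reshape(a, p, p);
response = conv2(img, kernel, 'same');  % same size as the input image
```

The response can then be inspected with imagesc(response), and hist applied to a row of S gives a quick look at the distribution of a source.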
Material
Data and Codes (tar.gz, zip)
Useful Matlab commands
help
plot
rand, randn
reshape
hist
mean
conv2
imagesc
Useful FastICA options
Option:          Values:
g                'pow3', 'tanh', 'gauss', 'skew'
approach         'symm', 'defl'
lastEig
stabilization
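Options are passed to fastica as name/value pairs after the data matrix. The sketch below shows two hedged example calls on stand-in mixtures; the data and variable names are illustrative assumptions. As far as the help text of the package describes them (check with help fastica), lastEig takes the index of the last eigenvalue retained in the PCA step and stabilization takes 'on' or 'off'; treat both as assumptions to verify.

```matlab
% Stand-in mixtures: four uniform sources mixed by a random 4-by-4 matrix.
X = rand(4, 4) * (rand(4, 500) - 0.5);  % one mixture per row

if exist('fastica', 'file')
    % Deflation approach with the cubic nonlinearity.
    [S1, A1, W1] = fastica(X, 'approach', 'defl', 'g', 'pow3');

    % Symmetric approach with tanh, keeping only the 3 largest principal
    % components and enabling the stabilized version of the iteration.
    [S2, A2, W2] = fastica(X, 'approach', 'symm', 'g', 'tanh', ...
                           'lastEig', 3, 'stabilization', 'on');
end
```

Comparing S1 and S2 on the same mixtures is one way to see how the approach and the nonlinearity affect the result.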
Page maintained by t615130@cis.hut.fi,
last updated Friday, 05-Feb-2010 18:19:20 EET