General
- Q
- What is SOM Toolbox ?
- A
-
A software library for Matlab 5 implementing the
Self-Organizing Map (SOM) algorithm.
- Q
- What is Matlab 5 ?
- A
-
Version 5 of Matlab, the popular scientific computing environment.
See the MathWorks home page at http://www.mathworks.com
- Q
- Who is the Toolbox meant for ?
- A
-
SOM Toolbox is meant for people interested in using the
Self-Organizing Map in their research and development projects. Some
basic knowledge of the SOM algorithm is required to understand what
the results mean. The Toolbox has been built to be modular, so you can
build your own functions on top of it and/or modify the existing
functions to suit your needs. Nothing prevents its use in education,
either. We have tried to make the Toolbox easy for beginners to use
as well (e.g. functions som_doit.m and som_gui.m).
- Q
- How does this relate to the Neural Networks toolbox ?
- A
-
The Neural Networks toolbox tries to cover the whole field of neural
networks, of which self-organizing maps are just one part.
The Neural Networks toolbox also contains an implementation of the
Self-Organizing Map algorithm. However, that implementation is not
flexible, efficient, or really up to the state of the art. If you
want to use self-organizing maps, we recommend using SOM Toolbox
instead.
- Q
- What does the SOM Toolbox cost ?
- A
-
Nothing. It's free.
Note, however, that it's also copyrighted,
so you can use it but not sell it (or even parts of it) onward
as your own. If you would like to make a commercial product
partly based on the SOM Toolbox, contact us and we'll consider it.
Feel free to use it as part of any non-commercial product,
as long as you remember to keep our copyright notices intact.
- Q
- When will the next version come out ?
- A
-
Er... someday... perhaps. The current version is 1.0beta, and we
may someday release version 1.0, but probably not very
soon. We'll be correcting bugs and maybe adding some new functions,
but don't hold your breath waiting for it to happen.
- Q
- What kind of support do you offer ?
- A
-
In this early phase, we'll try to fix any serious bugs you might find
and offer answers to some basic questions, but eventually we'll start
referring everyone to the FAQ and HOW-TO
lists. For the moment though, your questions and comments are
very welcome!
- Q
- What kind of environment do I need ?
- A
-
The SOM Toolbox will (or at least should) run anywhere that Matlab
version 5 runs. Unfortunately this rules out Windows 3.1 and DOS environments.
As the algorithms are computationally heavy and do not spare memory, we
recommend as fast a processor and as much memory as you can get. A 486
processor is sufficient, although a...bit...slow. Anyway, try it out and
see for yourself.
- Q
- I have this algorithm that would be a great addition to the Toolbox.
What do I do ?
- A
-
You have? Great! We have a separate contrib area in the Toolbox
reserved for just this kind of contribution. You can retain your
own copyright when you contribute something. To contribute, just
send your algorithm, along with any documentation and copyright
notices, to
somtlbx@mail.cis.hut.fi.
Handling data
- Q
- What should I do with categorical data ?
- A
-
Categorical data needs special treatment, because the Euclidean
norm is not a meaningful "distance" measure between
categories.
One solution is to use the one-of-C scheme: assuming
you have C categories, add C components to the data vectors so that
each of the new components corresponds to one category. For each data
vector, assign value 1 to the component corresponding to its category
and let all other "category-components" be zero.
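The one-of-C scheme can be sketched in a few lines. This is only an illustration in plain Python, not part of the SOM Toolbox; the function names `one_of_c` and `append_category` are hypothetical, and sorting the categories is just an assumption made to get a reproducible component order.

```python
def one_of_c(labels):
    """Encode category labels as one-of-C vectors (illustrative helper)."""
    # Assumption: sort the distinct labels so the component order is reproducible.
    categories = sorted(set(labels))
    index = {cat: i for i, cat in enumerate(categories)}
    vectors = []
    for label in labels:
        row = [0.0] * len(categories)   # all "category-components" start at zero
        row[index[label]] = 1.0         # component of this vector's category
        vectors.append(row)
    return categories, vectors


def append_category(data, labels):
    """Append the C new components to each numeric data vector."""
    categories, encoded = one_of_c(labels)
    return categories, [list(x) + e for x, e in zip(data, encoded)]


if __name__ == "__main__":
    cats, extended = append_category([[1.5, 2.0], [0.5, 1.0]], ["a", "b"])
    print(cats)
    print(extended)
```

With this encoding any two different categories are equally far apart in the Euclidean sense, so no artificial ordering between categories is introduced.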
- Q
- Which normalization method should I use ?
- A
-
It depends on your data and on what you consider
important. By default 'som_var_norm' is used, which scales
each component to zero mean and unit variance. This
makes sure each component has approximately equal influence on
training. If it is important to resolve value ranges with a lot
of samples more precisely than value ranges with only a
few samples, use 'som_hist_norm'.
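To make the difference between the two ideas concrete, here is a rough sketch in plain Python. It is not the Toolbox code: `var_norm` and `hist_norm` are hypothetical names, the use of the sample variance is an assumption, and the rank-based mapping is only one simple way to approximate histogram equalization, so 'som_hist_norm' may well work differently in detail.

```python
import math


def var_norm(column):
    """Scale one component to zero mean and unit variance."""
    n = len(column)
    if n < 2:
        return [0.0 for _ in column]
    mean = sum(column) / n
    # Assumption: sample variance (divide by n - 1).
    var = sum((x - mean) ** 2 for x in column) / (n - 1)
    std = math.sqrt(var)
    if std == 0.0:
        return [0.0 for _ in column]    # constant component carries no information
    return [(x - mean) / std for x in column]


def hist_norm(column):
    """Rank-based approximation of histogram equalization onto [0, 1]."""
    distinct = sorted(set(column))
    if len(distinct) == 1:
        return [0.0 for _ in column]
    rank = {value: i for i, value in enumerate(distinct)}
    # Densely populated value ranges get more of the output range.
    return [rank[x] / (len(distinct) - 1) for x in column]


if __name__ == "__main__":
    values = [1.0, 2.0, 3.0, 100.0]
    print(var_norm(values))
    print(hist_norm(values))
```

In the example, variance scaling leaves the outlier 100 far from the other values, whereas the rank-based mapping spreads all four values evenly over [0, 1].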
Map training
- Q
- What kind of training parameters should I use ?
- A
-
If you are not sure what kind of parameters to use, trust the
defaults. If you want to optimize, try varying the parameters: try
both initialization algorithms, train longer, try a few other
neighborhood widths (both initial and final), and the 'inv' learning
rate type.
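As a rough illustration of what these parameters control, the sketch below compares a linearly decreasing learning rate with an inverse-time one, and shrinks the neighborhood width from an initial to a final value. It assumes that 'inv' denotes an inverse-time decay; the function names, the default values, and the constant `c` are all hypothetical and are not taken from the Toolbox.

```python
def linear_rate(t, total, alpha0=0.5):
    """Learning rate decreasing linearly from alpha0 to zero."""
    return alpha0 * (1.0 - t / total)


def inverse_rate(t, total, alpha0=0.5, c=100.0):
    """Inverse-time learning rate; c is an assumed decay constant."""
    return alpha0 / (1.0 + c * t / total)


def neighborhood_width(t, total, initial, final):
    """Neighborhood width interpolated linearly from initial to final."""
    return initial + (final - initial) * t / total


if __name__ == "__main__":
    total = 100
    for t in (0, 25, 50, 100):
        print(t,
              linear_rate(t, total),
              inverse_rate(t, total),
              neighborhood_width(t, total, 5.0, 1.0))
```

The inverse-time rate drops quickly at first and then levels off, while the linear rate keeps decreasing at a constant pace, which is why switching between them can change the result.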
Visualization
- Q
- Why doesn't visualization work for map grids with more than 2 dimensions ?
- A
-
Well, that kind of visualization might be done using slices. But it
wouldn't be terribly illustrative, so we decided to leave that out.
Later on, we might add some kind of visualization tools for
3D map grids, but don't count on it.