SOMine Lite 2.1

survey performed by Juha Ikonen, June 29th 1998

SOMine Lite is a data mining tool for various applications of Self-Organising Maps (or Kohonen Networks) such as marketing, finance, industry, economics and science. It has many useful features which support analysis of non-linear dependencies, parameter-free clustering, association and prediction, non-linear regression, pattern recognition and animated system states monitoring.

The program implements a variant of the SOM, Kohonen's Batch-SOM, with a scaling technique that is not described in the documentation we had at hand at the time of the survey. Since the resulting maps are in good agreement with maps created with the basic SOM, the employed algorithm appears to be correct.

SOMine Lite succeeds in hiding complex technology from the user: knowledge of neural networks or the SOM algorithm is not needed. Together with the easy-to-use graphical user interface, this makes the program a good practical tool for visualising complex data.


Disclaimer: If any information on this page simply is not true, please tell us about it and we'll correct it ASAP.

Disclaimer: The opinions and observations herein should be considered the personal views of the person who performed the survey, at the time of the survey. They do not reflect any official position of his employer, of the Laboratory of Computer and Information Science or of the Neural Networks Research Center.


General

Program name: Viscovery SOMine Lite 2.1
Availability: Commercial; a demo version is available from the website at http://www.eudaptics.com/
Pricing: Single license $1,495, non-commercial license $695.
Company information:
  Eudaptics Software GmbH
  Hauptstrasse 99
  A-4232 Hagenberg, Austria
  Tel: +43 7236 3343 388
  Fax: +43 7236 3769
Purpose: A practical tool for advanced analysis and monitoring of numerical data sets.
Operating system: Windows 95, Windows NT 4.0
User interface: Graphical user interface
  • Good in general
  • Good regarding the SOM
Documentation: Online help
  • Good regarding user interface and program usage
  • Mediocre in technical/scientific details



SOM features

map parameters
Teaching algorithm: Batch-SOM with growing map. The implementation seems correct when compared to maps created with the basic SOM.
Map size: Two-dimensional map grid with a minimum of 9 and a maximum of 20000 nodes.
Map lattice and shape: Nodes are arranged on a hexagonal grid; the map shape is rectangular. The ratio of the two axes can be set by the user, or the software can derive it automatically from the data set.
Neighborhood function:
  • Function type: bubble (probably)
  • Neighborhood size (h): N/A
  • Learning rate (alpha): N/A
Initialization: Data sample
Distance function: Euclidean (probably)
Unknown components: Allowed
Teaching length: Explicit, depends on the selected training schedule
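To illustrate how the parameters listed above fit together, here is a minimal batch-SOM sketch with a bubble neighbourhood, Euclidean distance, initialisation from data samples and tolerance for unknown components. It is not SOMine's implementation: the growing-map and scaling techniques are not documented and are left out, a rectangular grid replaces the hexagonal one for brevity, and every name in it (`batch_som`, `sq_dist`, `bmu`, `radius`, the use of `None` for an unknown component) is a hypothetical choice made for this example.

```python
import random


def sq_dist(x, w):
    """Squared Euclidean distance over the known (non-None) components of x."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w) if xi is not None)


def bmu(x, codebook):
    """Index of the best matching unit (closest node) for sample x."""
    return min(range(len(codebook)), key=lambda j: sq_dist(x, codebook[j]))


def batch_som(data, rows, cols, epochs=13, radius=2.0, seed=0):
    """Train a rows x cols map with batch updates and a shrinking bubble."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise nodes from data samples (assumption: complete samples only).
    complete = [x for x in data if None not in x]
    codebook = [list(rng.choice(complete)) for _ in range(rows * cols)]
    # Grid position of each node, row-major (rectangular grid for brevity).
    grid = [divmod(j, cols) for j in range(rows * cols)]
    for epoch in range(epochs):
        # Shrink the bubble radius linearly over the epochs.
        r = radius * (1 - epoch / epochs)
        # Batch step 1: find the BMU of every sample with the current map.
        hits = [bmu(x, codebook) for x in data]
        # Batch step 2: each node becomes the mean of all samples whose BMU
        # lies inside its bubble (weight 1 inside the radius, 0 outside).
        for j, (rj, cj) in enumerate(grid):
            sums = [0.0] * dim
            counts = [0] * dim
            for x, b in zip(data, hits):
                rb, cb = grid[b]
                if (rb - rj) ** 2 + (cb - cj) ** 2 <= r * r:
                    for k, xk in enumerate(x):
                        if xk is not None:  # unknown components are skipped
                            sums[k] += xk
                            counts[k] += 1
            # Keep the old value where no sample contributed to a component.
            codebook[j] = [s / c if c else w
                           for s, c, w in zip(sums, counts, codebook[j])]
    return codebook
```

Because the batch update has no learning rate, the only schedule in this sketch is the shrinking radius, which is consistent with the "N/A" entry for alpha above.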
efficiency
Speed: [Windows NT 4.0, 200 MHz Pentium MMX, 128 MB RAM] For 3000 samples of 13-dimensional data, 13 epochs: 1 minute 18 seconds with default training settings.
Results: Normal results.
Final average quantization error: 0.02232
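For readers unfamiliar with the measure, the average quantization error is commonly computed as the mean distance between each data vector and its best matching unit. The sketch below assumes that definition with plain Euclidean distance; how SOMine itself computes the figure (for example, whether it uses scaled data or squared distances) is not documented, and the function name is hypothetical.

```python
import math


def average_quantization_error(data, codebook):
    """Mean Euclidean distance from each sample to its closest map node."""
    total = 0.0
    for x in data:
        # Distance to the best matching unit is the minimum over all nodes.
        total += min(math.dist(x, w) for w in codebook)
    return total / len(data)
```

Values obtained this way depend on how the data were scaled, so they are only comparable between maps trained on identically preprocessed data.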

Usability

preprocessing
Input formats: Text files, Microsoft Excel 5.0/95 files, Windows Clipboard
Data handling: The program provides good features for data handling:
  • A histogram of a selected component can be viewed.
  • With a logarithmic or sigmoid transformation, the user can influence the density characteristics of a component's distribution.
  • Each component of the data set is scaled separately; two components can be linked to apply the same scaling factor. There are two alternatives: scaling by variance and scaling by range.
  • Components can be weighted by a priority factor.
Data selection: Data can be selected by amplifying or suppressing certain ranges of component values. A component can also be omitted from the training process by setting its priority factor to zero.
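The two scaling alternatives and the priority factor mentioned above can be sketched as follows. The exact formulas used by SOMine are not given in its documentation, so this example assumes the usual interpretations: variance scaling divides by the standard deviation, range scaling divides by the difference between maximum and minimum, values are centred on the mean, and the priority factor is a plain multiplier. The function name `scale_component` is hypothetical.

```python
import statistics


def scale_component(values, method="variance", priority=1.0):
    """Scale one component of a data set and weight it by a priority factor."""
    if method == "variance":
        spread = statistics.pstdev(values)   # unit variance after scaling
    elif method == "range":
        spread = max(values) - min(values)   # unit range after scaling
    else:
        raise ValueError("method must be 'variance' or 'range'")
    if spread == 0:
        spread = 1.0  # constant component: avoid division by zero
    mean = statistics.fmean(values)
    # A priority of zero makes the component contribute nothing to distances.
    return [priority * (v - mean) / spread for v in values]
```

Under these assumptions, a priority of zero maps every value of the component to zero, which matches the described way of omitting a component from training.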
postprocessing
Output formats:
  • Map graphics can be saved in Windows Metafile or bitmap formats.
  • Selected nodes can be saved in text format or copied to the clipboard.
  • A selected path among map nodes can be saved in text format or copied to the clipboard.
Map measures: Both views and numerical values are available for quantization error, frequency and map curvature.
Labelling: Advanced labelling: labels can be inserted by typing, imported from an external text file or pasted from the clipboard.
Clustering: Automatic clustering. The user can set the cluster threshold value and the minimum cluster size in nodes. A clustering significance view helps in finding proper parameters for clustering. Clusters are visualised by shading and/or by separating lines.
visualization
Inspection of neurons: Advanced: component values, frequency, quantization error and curvature measures can be inspected. Statistical figures are also provided for a cluster, the neighbourhood of a node and a range of nodes. K nearest neighbours can be viewed for different values of K.
Clusters/map shape: U-matrix and clusters. Contours of similarity between adjacent nodes can be viewed by shading and/or separating lines. Map curvature can be viewed.
Correlations: By visualisation: component planes can be viewed in separate windows.
Data projections:
  • An external data set can be evaluated statistically with respect to a map clustering. The results are stored in a text file; there are no visual methods for inspecting the results.
  • The process monitoring feature plots a trajectory of the best matching units (BMUs) of data vectors from an external data set.
  • An external data set can be presented to a map, and a set of BMU vector values is written to a text file.
Markers: Labels
Teaching: Interface is user-friendly. During training, a graph shows the quantization error and the normalised distortion of the process.
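The process monitoring projection described under data projections amounts to mapping each vector of an external data set to its best matching unit and following those units across the map. A minimal sketch, assuming Euclidean distance, complete data vectors and nodes stored row by row on a rectangular index grid (the function name and the storage layout are assumptions made for this example, not SOMine's file format):

```python
import math


def bmu_trajectory(samples, codebook, cols):
    """Return the (row, column) grid position of the BMU of each sample."""
    path = []
    for x in samples:
        # Best matching unit: the node whose vector is closest to x.
        j = min(range(len(codebook)), key=lambda i: math.dist(x, codebook[i]))
        # Convert the row-major node index to grid coordinates.
        path.append(divmod(j, cols))
    return path
```

Plotting consecutive positions of the returned path on the map gives the kind of trajectory the monitoring view shows.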



http://www.cis.hut.fi/projects/somtoolbox/links/somine.shtml
somtlbx@mail.cis.hut.fi
Monday, 09-Oct-2000 12:53:09 EEST