
Postprocessing

 

Data postprocessing is an important part of data analysis. The primary postprocessing tool in ENTIRE is labeling, which can be done either automatically or manually. ENTIRE offers four labeling methods: autolabeling, labeling by unit selection, labeling by BMU search, and labeling by component value range. Figure 4.8 shows a SOM on which all four methods have been used.

Autolabeling requires a labeled set of sample vectors. The BMU (best-matching unit) of each sample vector is searched, and that unit is given the label of the vector. The same can be done in the other direction: a data set can be autolabeled using a labeled SOM. The other labeling methods are interactive. In the unit selection method, the user manually selects a set of units and gives them a common label. In the BMU search method, a number of BMUs of a sample vector are searched and these units are given a common label. In the component value range method, the user selects a range of values for a certain component, and every unit whose weight vector has that component within the range is given a common label.
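The labeling methods described above can be illustrated with a small sketch. This is not the ENTIRE implementation: the function names (bmu_indices, autolabel, label_n_bmus, label_by_range), the representation of the map as a plain list of weight vectors, and the use of Euclidean distance for the BMU search are all assumptions made for illustration. Unit selection is omitted because it amounts to the user picking unit indices by hand.

```python
import math

def bmu_indices(codebook, x, n=1):
    """Return the indices of the n units closest to x (Euclidean distance assumed)."""
    order = sorted(range(len(codebook)), key=lambda i: math.dist(codebook[i], x))
    return order[:n]

def add_label(unit_labels, i, label):
    """Attach a label to unit i unless it is already there."""
    if label not in unit_labels[i]:
        unit_labels[i].append(label)

def autolabel(codebook, samples, sample_labels):
    """Autolabeling: each unit receives the labels of the samples whose BMU it is."""
    unit_labels = [[] for _ in codebook]
    for x, label in zip(samples, sample_labels):
        add_label(unit_labels, bmu_indices(codebook, x)[0], label)
    return unit_labels

def label_n_bmus(codebook, unit_labels, x, n, label):
    """BMU search: the n best-matching units of x receive a common label."""
    for i in bmu_indices(codebook, x, n):
        add_label(unit_labels, i, label)

def label_by_range(codebook, unit_labels, component, low, high, label):
    """Component value range: label units whose weight component lies in [low, high]."""
    for i, w in enumerate(codebook):
        if low <= w[component] <= high:
            add_label(unit_labels, i, label)

# Hypothetical 2x2 map with two-dimensional weight vectors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
samples = [(0.1, 0.1), (0.9, 0.2), (0.8, 0.9)]

unit_labels = autolabel(codebook, samples, ["a", "b", "c"])
label_by_range(codebook, unit_labels, component=1, low=0.5, high=1.0, label="high-y")
label_n_bmus(codebook, unit_labels, (0.1, 0.6), n=2, label="near")
print(unit_labels)
```

In this sketch a unit may carry several labels, and a unit that is the BMU of no sample stays unlabeled after autolabeling. Whether ENTIRE handles these cases the same way is not stated in the text.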

Another postprocessing approach is to form new data sets from the data projections. For example, a data set can be constructed from the BMU coordinates and quantization errors of each data vector in a data set. The new data sets can then be analysed further using some other method. This postprocessing method also forms the basis of operation of the hierarchical maps.
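As a rough illustration of such a projection, the sketch below builds a new data set whose rows hold the map grid coordinates of the BMU and the quantization error of each data vector. The function name project, the grid_coords list, and the definition of the quantization error as the Euclidean distance between a vector and the weight vector of its BMU are assumptions for this example, not details taken from ENTIRE.

```python
import math

def project(codebook, grid_coords, data):
    """Map each data vector to (BMU grid coordinates..., quantization error)."""
    rows = []
    for x in data:
        # Distance from x to every unit's weight vector (Euclidean, assumed).
        dists = [math.dist(w, x) for w in codebook]
        i = dists.index(min(dists))          # index of the BMU
        rows.append((*grid_coords[i], dists[i]))
    return rows

# Hypothetical 2x2 map: weight vectors and the grid position of each unit.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
grid_coords = [(0, 0), (1, 0), (0, 1), (1, 1)]
data = [(0.0, 0.0), (1.0, 0.4), (0.0, 4.0)]

projected = project(codebook, grid_coords, data)
for row in projected:
    print(row)
```

The resulting rows form an ordinary data set and could, for instance, be used as training data for a second map, which is one way to read the remark about hierarchical maps.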



Juha Vesanto
Tue May 27 12:40:37 EET DST 1997