
Data sets


Like most algorithms, the SOM obeys the Garbage-In-Garbage-Out (GIGO) principle. Because the SOM learns directly from the data it receives, the quality of that data is of primary importance. For the same reason, appropriate preprocessing can considerably affect the learning result. ENTIRE offers a few basic data preprocessing methods: component scaling, histogram equalization and filtering. All of these methods are component-specific: each affects only one component of the data set vectors at a time.
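As a rough illustration of what "component-specific" means, the following Python sketch applies each kind of preprocessing to a single component of a small data set. It is not ENTIRE's code or interface: the function names (apply_to_component, scale, equalize, filter_component) are hypothetical, scaling is assumed to mean normalization to zero mean and unit variance, histogram equalization is assumed to be rank-based, and filtering is assumed to mean discarding vectors whose chosen component falls outside a given range.

```python
from statistics import mean, pstdev


def apply_to_component(vectors, k, transform):
    """Apply `transform` to component k only; all other components are kept as-is."""
    column = [v[k] for v in vectors]
    new_column = transform(column)
    return [v[:k] + [x] + v[k + 1:] for v, x in zip(vectors, new_column)]


def scale(column):
    """Assumed form of component scaling: zero mean, unit variance."""
    mu, sigma = mean(column), pstdev(column)
    sigma = sigma or 1.0  # a constant component is only shifted, not divided by zero
    return [(x - mu) / sigma for x in column]


def equalize(column):
    """Assumed rank-based histogram equalization: map distinct values evenly onto [0, 1]."""
    levels = sorted(set(column))
    if len(levels) == 1:
        return [0.0] * len(column)
    rank = {x: i for i, x in enumerate(levels)}
    return [rank[x] / (len(levels) - 1) for x in column]


def filter_component(vectors, k, low, high):
    """Assumed form of filtering: keep only vectors with low <= component k <= high."""
    return [v for v in vectors if low <= v[k] <= high]


# Hypothetical data set: three 2-component vectors (lists), one with an outlier.
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 1000.0]]

scaled = apply_to_component(data, 0, scale)        # only component 0 changes
equalized = apply_to_component(data, 1, equalize)  # only component 1 changes
kept = filter_component(data, 1, 0.0, 100.0)       # drops the outlier vector
```

In this sketch, scaling and equalization leave the number of vectors unchanged and rewrite one column, whereas filtering inspects one column but removes whole vectors.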

Juha Vesanto
Tue May 27 12:40:37 EET DST 1997