Errors in data:

Naturally, the SOM suffers from flaws of any kind in the data, but its performance degrades gracefully. An outlier in the data affects only one map unit and its neighbors. The outlier is also easy to detect from the map, since its distance in the input space from the other units is large.
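One way to make the detection concrete is to score each map unit by its mean input-space distance to its immediate neighbors on the map grid. The following is only a minimal sketch under assumptions not stated in the text: the helper name mean_neighbor_distance, the rectangular grid coordinates, the 4-neighborhood, and the toy codebook are all hypothetical.

```python
import math

def mean_neighbor_distance(codebook, grid):
    """For each unit, mean input-space distance to its immediate grid neighbors."""
    scores = []
    for i, wi in enumerate(codebook):
        dists = [
            math.dist(wi, wj)
            for j, wj in enumerate(codebook)
            # Assumed neighborhood: units exactly one grid step away.
            if abs(grid[i][0] - grid[j][0]) + abs(grid[i][1] - grid[j][1]) == 1
        ]
        scores.append(sum(dists) / len(dists) if dists else 0.0)
    return scores

# Toy 4x1 map: the last unit has been pulled far away by an outlier.
grid = [(0, 0), (1, 0), (2, 0), (3, 0)]
codebook = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [5.0, 5.0]]

scores = mean_neighbor_distance(codebook, grid)
suspect = max(range(len(scores)), key=scores.__getitem__)
```

In this toy map the unit with the highest score is the one that sits far from the rest in the input space, so it is the natural candidate to inspect for outliers. What counts as a "large" distance depends on the data and is left open here.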

An important property of the SOM is that it can be used even on partial data, that is, data in which some component values are missing. Only a simple change to the algorithm is needed: if the sample input vector at a given training step has missing components, those components are simply left out of the distance calculations and the updating procedure. The algorithm remains statistically valid unless the number of missing components in the vector is large.
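The change described above can be sketched as a single sequential training step. This is an illustration under assumptions, not the author's implementation: missing components are marked with NaN, the neighborhood is Gaussian, and the function names and the values of alpha and sigma are all hypothetical.

```python
import math

def masked_sq_distance(x, w):
    """Squared Euclidean distance over the components of x that are present."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w) if not math.isnan(xi))

def find_bmu(x, codebook):
    """Index of the unit closest to x, judged on the observed components only."""
    return min(range(len(codebook)),
               key=lambda i: masked_sq_distance(x, codebook[i]))

def train_step(x, codebook, grid, alpha=0.5, sigma=1.0):
    """One sequential update; missing components of x leave the weights untouched."""
    bmu = find_bmu(x, codebook)
    for i, w in enumerate(codebook):
        # Assumed Gaussian neighborhood on the map grid, centered on the BMU.
        d2 = sum((a - b) ** 2 for a, b in zip(grid[i], grid[bmu]))
        h = alpha * math.exp(-d2 / (2.0 * sigma ** 2))
        for k, xk in enumerate(x):
            if not math.isnan(xk):  # skip missing components in the update
                w[k] += h * (xk - w[k])
    return bmu

# Toy 3x1 map and a sample whose second component is missing.
grid = [(0, 0), (1, 0), (2, 0)]
codebook = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
x = [1.9, float("nan"), 2.2]

bmu = train_step(x, codebook, grid)
```

The best-matching unit is chosen from the first and third components alone, and the second component of every weight vector is left as it was. A sample with no observed components matches every unit equally well, so in practice such samples would presumably be skipped.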



Juha Vesanto
Tue May 27 12:40:37 EET DST 1997