The idea of using the Self-Organizing Map to estimate the distribution of fault-free samples, and then classifying an unknown sample as a defect if it differs enough from this estimated distribution, is not new [1]. The remaining question has been how much an unknown sample may differ from the best-matching unit of the map before it is classified as a defect. One solution is to set a threshold on the distance between the feature vector of the unknown sample and the best-matching map unit. Such a threshold is not easy to choose, however, and a single threshold shared by all map units may not be enough; a distinct threshold for each map unit would be necessary.
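The distance-threshold approach described above can be sketched as follows. This is only an illustration of that baseline, not of the proposed method; the function names, the Euclidean distance, and the example codebook and threshold values are assumptions made for the sketch.

```python
import math

def best_matching_unit(x, codebook):
    """Return (index, distance) of the codebook vector closest to x."""
    distances = [math.dist(x, w) for w in codebook]
    k = min(range(len(codebook)), key=distances.__getitem__)
    return k, distances[k]

def is_defect_by_distance(x, codebook, thresholds):
    """Flag x as a defect if it lies farther from its best-matching
    unit than the threshold assigned to that unit."""
    k, dist = best_matching_unit(x, codebook)
    return dist > thresholds[k]

# Hypothetical two-unit map with one hand-picked threshold per unit.
codebook = [[0.0, 0.0], [1.0, 1.0]]
thresholds = [0.3, 0.1]

near_unit_0 = is_defect_by_distance([0.2, 0.0], codebook, thresholds)
near_unit_1 = is_defect_by_distance([1.2, 1.0], codebook, thresholds)
```

Both samples lie at roughly the same distance from their best-matching units, yet only the second one is flagged, because the two units carry different thresholds. Choosing those values by hand for every unit is the difficulty the text refers to.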
The proposed method overcomes the threshold selection problem.
It makes use of the Voronoi set of each map unit and defines a new
rule for finding the best-matching map unit.
For each component plane j of the Voronoi set
a confidence interval is defined. If a uniform distribution is
assumed for component plane j, which is the most reasonable
assumption when nothing else is known about the distribution, the
definition of the confidence interval is straightforward. However,
these distributions are not necessarily uniform. If more accuracy
were needed, the distributions could be estimated, for example,
with the Parzen estimation procedure.
The Voronoi tessellation of the feature space could also be formed
with an estimator other than the Self-Organizing Map.
For a recent review of different estimators, see, for example,
[5].
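As a minimal sketch of the uniform-distribution case discussed above, the confidence interval of one component can be taken as the central part of the range covered by the Voronoi set. The sketch assumes that the support of the uniform distribution is estimated by the sample minimum and maximum and that the confidence level d is given as a fraction; both choices, and the function names, are assumptions rather than details stated in the text.

```python
def uniform_confidence_interval(values, d):
    """Central interval containing a fraction d of a uniform distribution
    whose support is estimated by the sample minimum and maximum."""
    if not 0.0 < d <= 1.0:
        raise ValueError("d must be in (0, 1]")
    a, b = min(values), max(values)
    margin = 0.5 * (1.0 - d) * (b - a)  # equal tail cut from each end
    return a + margin, b - margin

def unit_intervals(voronoi_set, d):
    """One confidence interval per component j of the feature vectors
    that belong to the Voronoi set of a single map unit."""
    columns = zip(*voronoi_set)  # transpose: samples -> components
    return [uniform_confidence_interval(col, d) for col in columns]

# Hypothetical Voronoi set: three 2-dimensional feature vectors.
voronoi_set = [[0.0, 10.0], [1.0, 20.0], [0.5, 15.0]]
full = unit_intervals(voronoi_set, 1.0)    # whole observed range
narrow = unit_intervals(voronoi_set, 0.9)  # 5% trimmed from each end
```

With d equal to 1 the interval is simply the observed range of the component; lowering d shrinks it symmetrically from both ends.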
There are two parameters in the proposed scheme that must be set by
hand, namely the confidence level d and the limit T.
The confidence level d should be near 100% so that the mean of the
minimum error is zero for fault-free samples.
The determination of the limit T is quite straightforward.
First, T is an integer that takes values between zero and n, where
n is the dimension of the feature vector. Second, the value of T
depends on the desired accuracy of the segmentation. For large
values of T noise is eliminated and only the most evident (easily
detectable) defects are found. For small values of T, on the other
hand, noise increases but the weak (difficult) defects are also
found. The value of T is therefore a compromise between accuracy
and noise.
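The role of T can be illustrated with a small sketch. It assumes one plausible reading of the scheme: the error of a sample with respect to a map unit is the number of feature components that fall outside that unit's confidence intervals, the best-matching unit is the one with the smallest error, and the sample is a defect when this minimum error exceeds T. The strict comparison, the function names, and the interval values are assumptions made for the sketch.

```python
def components_outside(x, intervals):
    """Count the components of x that fall outside their interval."""
    return sum(1 for xj, (lo, hi) in zip(x, intervals) if not lo <= xj <= hi)

def minimum_error(x, map_intervals):
    """Smallest count over all map units; the unit attaining it plays
    the role of the best-matching unit in this reading."""
    return min(components_outside(x, unit) for unit in map_intervals)

def is_defect(x, map_intervals, T):
    """Classify x as a defect when the minimum error exceeds T."""
    return minimum_error(x, map_intervals) > T

# Hypothetical map with two units and n = 3 feature components.
map_intervals = [
    [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)],
    [(2.0, 3.0), (2.0, 3.0), (2.0, 3.0)],
]
weak = [0.5, 0.5, 5.0]  # one component outside the closest unit

strict = is_defect(weak, map_intervals, T=0)   # small T: weak defect found
lenient = is_defect(weak, map_intervals, T=1)  # larger T: sample accepted
```

Under this reading, a sample whose components all fall inside the intervals of some unit has minimum error zero and is never flagged, which matches the requirement that the mean of the minimum error be zero for fault-free samples.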
The results of experiments with base paper samples are encouraging. Compared with simple threshold segmentation based on gray-level histograms, the proposed scheme has obvious advantages. The method is also general in the sense that it can be applied to fault detection on different types of surfaces. However, it may be necessary to reselect the features to take into account the specific properties of the surface type.