Principal curves.

PCA can be generalized from linear projections to nonlinear curves. Whereas PCA constructs a good projection of a data set onto a linear manifold, the goal in constructing a principal curve is to project the set onto a nonlinear manifold. Principal curves [Hastie and Stuetzle, 1989] are smooth curves defined by the property that each point of the curve is the average of all data points that project to it, i.e., of all data points for which that point is the closest point on the curve. Intuitively speaking, the curves pass through the "center" of the data set. Principal curves generalize the principal components extracted using PCA in the sense that a linear principal curve is a principal component; the connections between the two methods are delineated more carefully in the original article. Although the extracted structures are called principal curves, the generalization to surfaces seems relatively straightforward; the resulting algorithms, however, become computationally more intensive.
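
The defining property can also be stated as a formula. The notation below is an assumption introduced here for illustration and does not appear in this text: X is a random vector with the distribution of the data, f is a smooth curve parametrized by \lambda, and \lambda_f(x) is the parameter value of the point on the curve closest to x.

```latex
% Projection index: parameter of the curve point closest to x
% (the largest such parameter if several points are equally close).
\[
  \lambda_f(x) = \sup_{\lambda} \Bigl\{ \lambda :
      \| x - f(\lambda) \| = \inf_{\mu} \| x - f(\mu) \| \Bigr\}
\]

% Self-consistency: each point of the curve is the mean of the data projecting to it.
\[
  f(\lambda) = \mathrm{E}\bigl[ X \mid \lambda_f(X) = \lambda \bigr]
\]
```

A straight line that satisfies the second equation is a principal component, which is the sense in which the curves generalize PCA.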

The concept of continuous principal curves may aid in understanding how principal components could be sensibly generalized. To be useful in practical computations, however, the curves must be discretized. It has turned out [Mulier and Cherkassky, 1995, Ritter et al., 1992] that discretized principal curves are essentially equivalent to SOMs, which had been introduced before Hastie and Stuetzle (1989) proposed principal curves. It thus seems that the concept of principal curves is most useful in providing one possible viewpoint on the properties of the SOM algorithm.
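
To make the discretization concrete, the following is a minimal sketch of one way to discretize a curve: it is replaced by a chain of nodes, and a projection step alternates with an averaging step. The function name `fit_discretized_curve`, its parameters, the Gaussian neighborhood along the chain, and the shrinking-width schedule are all assumptions made for illustration. The sketch resembles a batch-type SOM on a one-dimensional lattice; it is not the algorithm of Hastie and Stuetzle, nor the exact procedure of any of the cited works.

```python
import numpy as np


def fit_discretized_curve(data, n_nodes=10, n_iter=50,
                          width_start=3.0, width_end=0.5):
    """Illustrative project-then-average iteration on a chain of nodes."""
    data = np.asarray(data, dtype=float)

    # Initialize the nodes on a segment of the first principal component.
    mean = data.mean(axis=0)
    centered = data - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    t = np.linspace(scores.min(), scores.max(), n_nodes)
    nodes = mean + np.outer(t, vt[0])

    index = np.arange(n_nodes)
    for it in range(n_iter):
        # Neighborhood width shrinks geometrically (an assumed schedule).
        frac = it / max(n_iter - 1, 1)
        width = width_start * (width_end / width_start) ** frac

        # Projection step: each data point "projects" to its closest node.
        d2 = ((data[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
        winners = d2.argmin(axis=1)

        # Averaging step: each node becomes a weighted mean of the data,
        # weighted by the distance along the chain between that node and
        # the node each point projects to.
        h = np.exp(-0.5 * ((index[:, None] - winners[None, :]) / width) ** 2)
        weight_sum = h.sum(axis=1, keepdims=True)
        updated = (h @ data) / np.maximum(weight_sum, np.finfo(float).tiny)
        # Keep the old position if a node received no weight at all.
        nodes = np.where(weight_sum > 0, updated, nodes)

    return nodes
```

Called on an array of shape (n_points, n_dims), the function returns an array of shape (n_nodes, n_dims) holding the node positions in chain order. With a very narrow neighborhood each node is simply the average of the points closest to it, which is the discrete counterpart of the averaging property that defines principal curves.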

Sami Kaski
Mon Mar 31 23:43:35 EET DST 1997