Statistical structure of complex cell outputs: emergence of contour coding

Previous studies have shown how many basic properties of the primary visual cortex, such as the receptive fields of simple and complex cells and the spatial organization (topography) of the cells, can be understood as efficient coding of natural images. In this project we extend the framework by considering how the responses of complex cells (when fed with natural image input) could be sparsely represented by a higher-order neural layer.

We model complex-cell activations by a simple energy model consisting of quadrature Gabor filters, squaring, and summing (this part of our model was fixed, not learned). This is depicted in the figure below:
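As a rough illustration of the energy model described above, the following sketch computes the response of one model complex cell. The function names and all parameter values (wavelength, envelope width, isotropic Gaussian envelope) are assumptions made for this example, not the settings used in the project.

```python
import numpy as np

def gabor(size, wavelength, theta, phase, sigma):
    """Gabor filter: a sinusoidal carrier under an isotropic Gaussian envelope."""
    coords = np.arange(size) - (size - 1) / 2.0   # pixel coordinates centred on 0
    y, x = np.meshgrid(coords, coords, indexing="ij")
    u = x * np.cos(theta) + y * np.sin(theta)     # position along the carrier direction
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * u / wavelength + phase)

def complex_cell_energy(patch, wavelength=8.0, theta=0.0, sigma=4.0):
    """Energy model: squared outputs of a quadrature pair of Gabor filters, summed."""
    size = patch.shape[0]
    even = gabor(size, wavelength, theta, 0.0, sigma)            # cosine-phase filter
    odd = gabor(size, wavelength, theta, -np.pi / 2.0, sigma)    # sine-phase filter
    return np.sum(even * patch) ** 2 + np.sum(odd * patch) ** 2
```

Because the two filters differ in phase by 90 degrees, the summed energy is approximately independent of the phase of a grating at the preferred orientation and frequency, which is the defining property of a complex cell in this kind of model.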

We first sampled patches from a set of images of natural scenes. Then, we calculated our model complex cell responses to these patches. For simplicity of interpretation and for computational reasons, we restricted our analysis to a single spatial scale, and the cells were placed on a rectangular 6-by-6 grid with 4 differently oriented cells at each location. Three patches and their corresponding responses are shown below. (The ellipses show the orientation and approximate extent of the individual complex cells. The brightness of each ellipse indicates the response strength.)
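The sampling and response computation can be sketched as follows. With a 6-by-6 grid and 4 orientations per location, each patch yields a vector of 6 x 6 x 4 = 144 non-negative responses. The patch size, grid spacing, filter parameters, and all function names here are assumptions for illustration only; the source does not give them.

```python
import numpy as np

def gabor_at(size, cx, cy, wavelength, theta, phase, sigma):
    """Gabor filter centred at pixel (cx, cy) inside a size-by-size patch."""
    coords = np.arange(size, dtype=float)
    y, x = np.meshgrid(coords, coords, indexing="ij")
    dx, dy = x - cx, y - cy
    u = dx * np.cos(theta) + dy * np.sin(theta)
    envelope = np.exp(-(dx**2 + dy**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * u / wavelength + phase)

def build_filter_bank(size=24, grid=6, n_orient=4, wavelength=4.0, sigma=2.0):
    """Quadrature filter pairs on a grid-by-grid lattice with n_orient orientations each."""
    centers = np.linspace(0, size - 1, grid + 2)[1:-1]   # interior lattice positions
    thetas = np.arange(n_orient) * np.pi / n_orient      # 0, 45, 90, 135 degrees
    even, odd = [], []
    for cy in centers:
        for cx in centers:
            for th in thetas:
                even.append(gabor_at(size, cx, cy, wavelength, th, 0.0, sigma).ravel())
                odd.append(gabor_at(size, cx, cy, wavelength, th, -np.pi / 2, sigma).ravel())
    return np.array(even), np.array(odd)                 # each: (grid*grid*n_orient, size*size)

def sample_patches(image, n, size, rng):
    """Draw n random size-by-size patches from a 2-D grayscale image."""
    rows = rng.integers(0, image.shape[0] - size + 1, n)
    cols = rng.integers(0, image.shape[1] - size + 1, n)
    return np.stack([image[r:r + size, c:c + size] for r, c in zip(rows, cols)])

def complex_cell_responses(patches, even, odd):
    """Energy-model responses: one row per patch, one column per model complex cell."""
    flat = patches.reshape(len(patches), -1)
    return (flat @ even.T) ** 2 + (flat @ odd.T) ** 2
```

In use, `image` would be a natural-scene photograph loaded as a 2-D array; the resulting response matrix is the data passed on to the sparse coding stage.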

Now, we model our data (complex-cell activations) by the linear sparse coding (ICA) model. In other words, we seek a representation:

$$\mathbf{x} = \mathbf{a}_1 s_1 + \mathbf{a}_2 s_2 + \dots + \mathbf{a}_k s_k$$

such that the stochastic coefficients are sparse and independent. As our input data is non-negative, we require the same of our model (both basis vectors and coefficients). This is a modification of standard ICA.
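One way to estimate a non-negative sparse representation of this kind is sketched below. It assumes an objective of the form 0.5 * ||X - A S||^2 + lam * sum(S), with non-negative basis vectors (columns of A) and non-negative coefficients (S), minimized by alternating a projected gradient step on A with a multiplicative update on S. The source does not spell out the estimation procedure, so the objective, the update rules, and every parameter value here are assumptions.

```python
import numpy as np

def nnsc(X, n_basis, lam=0.1, n_iter=200, seed=0):
    """Non-negative sparse coding sketch.

    X: (dim, n_samples) non-negative data matrix.
    Returns A (dim, n_basis) and S (n_basis, n_samples), both non-negative.
    """
    rng = np.random.default_rng(seed)
    dim, n_samples = X.shape
    eps = 1e-12
    A = rng.random((dim, n_basis)) + 1e-3
    A /= np.linalg.norm(A, axis=0, keepdims=True)
    S = rng.random((n_basis, n_samples)) + 1e-3
    for _ in range(n_iter):
        # Gradient step on the basis; step size from the spectral norm of S S^T.
        step = 1.0 / (np.linalg.norm(S @ S.T, 2) + eps)
        A = A - step * (A @ S - X) @ S.T
        A = np.maximum(A, 0.0)                                # project onto non-negative values
        A /= np.linalg.norm(A, axis=0, keepdims=True) + eps   # keep basis vectors at unit norm
        # Multiplicative update keeps S non-negative; lam pushes coefficients toward zero.
        S *= (A.T @ X) / (A.T @ A @ S + lam + eps)
    return A, S
```

Applied to the complex-cell data, `X` would hold one 144-dimensional response vector per column and `n_basis` would be 288; larger values of `lam` give sparser coefficients at the cost of reconstruction accuracy.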

A small subset (16/288) of the estimated basis patterns is shown below:

Each basis pattern corresponds to a higher-order unit that represents that particular kind of input. As can be readily seen, our higher-order units code for collinear active complex cells, essentially signalling the presence of part of a contour. The units have varying preferred lengths, with some coding the activation of only a single complex cell whereas others represent longer contours.

As an important part of the representation is a competition between units, the varying length preference leads to length-tuning: short units are end-stopped, whereas units coding longer contours (collator units) do not respond at all to short contours. This is shown in our paper (see below). In addition, the paper also describes how contour integration could be interpreted as top-down feedback in the presented model.
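The effect of competition can be illustrated with a hand-built toy example, which is not the learned basis from the project. Three collinear complex cells feed two hypothetical units: a short unit driven by the middle cell only and a long unit driven by all three. Coefficients are inferred by minimizing an assumed non-negative sparse coding objective, 0.5 * ||x - A s||^2 + lam * sum(s), with multiplicative updates; the objective, the basis, and `lam` are all assumptions for this sketch.

```python
import numpy as np

def infer(A, x, lam=0.1, n_iter=2000):
    """Non-negative sparse coefficients for input x given a fixed basis A."""
    s = np.ones(A.shape[1])
    for _ in range(n_iter):
        # Units with overlapping basis vectors share the denominator term
        # A.T @ A @ s, so an active unit suppresses its competitors.
        s *= (A.T @ x) / (A.T @ A @ s + lam)
    return s

# Toy basis over three collinear complex cells (unit-norm columns).
short_unit = np.array([0.0, 1.0, 0.0])     # codes a single active cell
long_unit = np.ones(3) / np.sqrt(3.0)      # codes all three cells together
A = np.column_stack([short_unit, long_unit])

short_contour = np.array([0.0, 1.0, 0.0])  # only the middle cell is active
long_contour = np.array([1.0, 1.0, 1.0])   # all three cells are active

s_short_in = infer(A, short_contour)       # [short unit, long unit]
s_long_in = infer(A, long_contour)
```

In this toy setting the short unit responds to the short contour but is silenced when the contour is extended, because the long unit explains the input more cheaply; conversely, the long unit stays silent for the short contour. This mirrors, in miniature, the end-stopping and length-tuning behaviour described above.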

For more details, see our paper:

P.O. Hoyer and A. Hyvärinen. A multi-layer sparse coding network learns contour coding from natural images. Vision Research, in press.

Patrik Hoyer & Aapo Hyvärinen
December 2001