It is interesting how powerful vector quantization can be. Since I like the quantization idea a lot, the following insight from a 2011 paper by Coates, Lee, and Ng struck me as notable: K-means outperforms single-layer neural nets for self-taught learning. The task is to learn features of an image from small patches extracted with a sliding window. Features are extracted over these patches by one of the following methods (a short code sketch of each appears after the list):
1) Learning an autoencoder over these patches. We learn a neural net whose input layer is connected to a smaller hidden layer, which in turn connects to an output layer. Input and output are the same, so the net tries to reconstruct its input using the feature detectors in the hidden layer.
2) Learning a restricted Boltzmann machine. A probabilistic version of the above, modeled as a Markov random field. No output layer is needed since the model is undirected, and learning can be performed using Gibbs sampling.
3) K-means: vector quantization. Features are soft assignments to the learned cluster centers.
4) Gaussian mixtures: fitting a Gaussian mixture model with Expectation Maximization. The features are the posterior component-membership probabilities.
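A minimal sketch of method 1), under the assumption that scikit-learn is available: I just regress the patches onto themselves with MLPRegressor, so the single hidden layer acts as the bottleneck and the features are its activations. The `patches` array and all the sizes below are stand-ins, not the paper's actual setup.

    # Minimal autoencoder sketch: a one-hidden-layer net trained to reconstruct
    # its own input. `patches` is assumed to be an (n_patches, n_pixels) array
    # of flattened, preprocessed image patches.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    patches = rng.random((1000, 64))          # stand-in for real 8x8 patches

    n_hidden = 32                             # hidden layer smaller than the input
    ae = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="logistic",
                      max_iter=500, random_state=0)
    ae.fit(patches, patches)                  # target == input -> reconstruction

    # The features are the hidden-layer activations, computed from the first
    # weight matrix and bias of the fitted network.
    def encode(X):
        return 1.0 / (1.0 + np.exp(-(X @ ae.coefs_[0] + ae.intercepts_[0])))

    features = encode(patches)                # shape: (n_patches, n_hidden)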
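For method 2), scikit-learn's BernoulliRBM is a convenient stand-in. It is trained with persistent contrastive divergence, i.e. block Gibbs sampling between the visible and hidden units; the hyperparameters here are again just placeholders.

    # Minimal RBM sketch. Inputs are assumed to be scaled to [0, 1].
    import numpy as np
    from sklearn.neural_network import BernoulliRBM

    rng = np.random.default_rng(0)
    patches = rng.random((1000, 64))          # stand-in for real 8x8 patches in [0, 1]

    rbm = BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)
    rbm.fit(patches)

    # Features are the hidden-unit activation probabilities given each patch.
    features = rbm.transform(patches)         # shape: (n_patches, 32)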
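Method 3) is the interesting one. Here is a rough sketch using the "triangle" soft assignment (activation = max(0, average distance - distance to centroid k)), which is the soft encoding I understand the paper to use; everything else is a placeholder.

    # K-means feature sketch: cluster the patches, then encode each patch by a
    # soft assignment to the centroids via the triangle activation.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    patches = rng.random((1000, 64))          # stand-in for real 8x8 patches

    k = 100
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(patches)

    # z[i, j] = distance from patch i to centroid j
    z = np.linalg.norm(patches[:, None, :] - km.cluster_centers_[None, :, :], axis=2)
    features = np.maximum(0.0, z.mean(axis=1, keepdims=True) - z)   # (n_patches, k)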
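And method 4) as a sketch: fit a mixture with EM and take the posterior component probabilities as the features.

    # Gaussian mixture sketch: EM fit, then posterior P(component | patch) as features.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    patches = rng.random((1000, 64))          # stand-in for real 8x8 patches

    gmm = GaussianMixture(n_components=50, covariance_type="diag", random_state=0)
    gmm.fit(patches)

    features = gmm.predict_proba(patches)     # shape: (n_patches, 50)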
These features are then used for classification with a Support Vector Machine (sketched below). Interestingly enough, K-means outperforms all the other methods by at least three to four percent accuracy on the CIFAR and NORB datasets.
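A sketch of that last step, assuming the patch features have already been pooled into one vector per image (the paper pools them over image regions, if I remember correctly); `image_features` and `labels` are stand-ins.

    # Classification sketch: train a linear SVM on the extracted image features.
    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    image_features = rng.random((500, 400))   # stand-in pooled feature vectors
    labels = rng.integers(0, 10, size=500)    # stand-in class labels

    X_train, X_test, y_train, y_test = train_test_split(
        image_features, labels, test_size=0.2, random_state=0)

    clf = LinearSVC(C=1.0).fit(X_train, y_train)
    print("test accuracy:", clf.score(X_test, y_test))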
However, on many of the bigger tasks, more recent results suggest that deep convolutional neural nets outperform everything else.