+1 vote
in Machine Learning by (4.2k points)

After reading a few papers on deep learning and deep belief networks, I have a basic idea of how they work, but I am still stuck on the last step, i.e. the classification step. Most of the implementations I found on the Internet deal with generation (MNIST digits).

Is there an explanation (or code) available somewhere that talks about classifying images (preferably natural images or objects) using DBNs?

Also, some pointers in the right direction would be really helpful.

1 Answer

+1 vote
by (6.8k points)

The explanation

These days, state-of-the-art deep learning approaches to image classification problems (e.g. ImageNet) are usually "deep convolutional neural networks" (Deep ConvNets). They look roughly like this ConvNet configuration by Krizhevsky et al:

For inference (classification), you feed an image into the left side (notice that the depth on the left side is 3, for RGB), crunch through a series of convolution filters, and it spits out a 1000-dimensional vector on the right-hand side. This picture is specifically for ImageNet, which focuses on classifying 1000 categories of images, so the 1000d vector is "score of how likely it is that this image fits in the category."
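To make the inference path concrete, here is a minimal pure-Python sketch of the same shape of computation: a 3-channel image goes through one bank of convolution filters, and a final linear layer plus softmax turns the result into a 1000-entry score vector. This is not the Krizhevsky et al. architecture. The single convolution layer, the global average pooling, the 8x8 image size, the random weights, and all function names are assumptions chosen only for illustration.

```python
import math
import random

def conv_relu(image, filters):
    """Slide each filter over the image (valid cross-correlation), then apply ReLU.
    image: [channels][height][width], filters: [num_filters][channels][k][k]."""
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    k = len(filters[0][0])
    feature_maps = []
    for f in filters:
        fmap = []
        for i in range(height - k + 1):
            row = []
            for j in range(width - k + 1):
                s = 0.0
                for c in range(channels):
                    for di in range(k):
                        for dj in range(k):
                            s += image[c][i + di][j + dj] * f[c][di][dj]
                row.append(max(s, 0.0))  # ReLU non-linearity
            fmap.append(row)
        feature_maps.append(fmap)
    return feature_maps

def global_avg_pool(feature_maps):
    """Collapse each feature map to a single number (its mean)."""
    return [sum(sum(r) for r in fm) / (len(fm) * len(fm[0])) for fm in feature_maps]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def classify(image, filters, weights, biases):
    """Forward pass: image -> conv filters -> pooled features -> class scores."""
    features = global_avg_pool(conv_relu(image, filters))
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)

rng = random.Random(0)
NUM_CLASSES = 1000  # ImageNet-style output size
# Hypothetical tiny RGB image (depth 3) and untrained random parameters.
image = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(3)]
filters = [[[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
            for _ in range(3)] for _ in range(4)]
weights = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

scores = classify(image, filters, weights, biases)
predicted_class = max(range(NUM_CLASSES), key=lambda c: scores[c])
```

Each entry of `scores` plays the role of "how likely is it that this image fits in category c", and the predicted label is simply the index with the largest score. With untrained random weights the prediction is meaningless; the point is only the data flow.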

Training the neural net is only slightly more complicated. For training, you basically run classification repeatedly, and every so often you do backpropagation (see Andrew Ng's lectures) to improve the convolution filters in the network. Basically, backpropagation asks "what did the network classify correctly/incorrectly? For misclassified stuff, let's fix the network a little bit."
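The "classify, then fix the network a little bit" loop can be sketched in a few lines of pure Python. To keep it short, a single softmax layer stands in for the whole network, so the only gradient step is from the loss back to that layer's weights; in a real ConvNet, backpropagation uses the chain rule to carry the same error signal further back into every convolution filter. The toy 2-D dataset, the learning rate, the step count, and all names are assumptions made up for this example.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def predict(x, W, b):
    """Forward pass: class probabilities for one input vector."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) + bk
                    for row, bk in zip(W, b)])

def mean_loss(data, W, b):
    """Average cross-entropy: low when the true class gets a high score."""
    return -sum(math.log(predict(x, W, b)[y]) for x, y in data) / len(data)

def train_step(data, W, b, lr):
    """Classify every example, accumulate the errors, then nudge the weights."""
    grad_W = [[0.0] * len(W[0]) for _ in W]
    grad_b = [0.0] * len(b)
    for x, y in data:
        p = predict(x, W, b)
        for k in range(len(W)):
            # Error signal: predicted probability minus the correct answer (0 or 1).
            err = p[k] - (1.0 if k == y else 0.0)
            grad_b[k] += err
            for i, xi in enumerate(x):
                grad_W[k][i] += err * xi
    n = len(data)
    for k in range(len(W)):
        b[k] -= lr * grad_b[k] / n
        for i in range(len(W[k])):
            W[k][i] -= lr * grad_W[k][i] / n

def accuracy(data, W, b):
    hits = 0
    for x, y in data:
        p = predict(x, W, b)
        hits += int(max(range(len(p)), key=lambda k: p[k]) == y)
    return hits / len(data)

# Hypothetical toy data: (features, label), two well-separated classes.
data = [([0.1, 0.2], 0), ([0.2, 0.1], 0), ([0.0, 0.3], 0),
        ([0.9, 0.8], 1), ([0.8, 0.9], 1), ([1.0, 0.7], 1)]
W = [[0.0, 0.0], [0.0, 0.0]]  # one weight row per class
b = [0.0, 0.0]

loss_before = mean_loss(data, W, b)
for _ in range(1000):
    train_step(data, W, b, lr=0.5)
loss_after = mean_loss(data, W, b)
final_accuracy = accuracy(data, W, b)
```

Here every step uses the whole (tiny) dataset; real ConvNet training does the same thing on small random batches of images, which is the "every so often" part of the description above.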


Caffe is a very fast open-source implementation (faster than cuda-convnet from Krizhevsky et al) of deep convolutional neural networks. The Caffe code is pretty easy to read; there is essentially one C++ file per type of network layer (e.g. convolutional layers, max-pooling layers, etc).

For more details, check Deep learning with TensorFlow.
