How many layers in a neural network are best for image classification?

“Very simple, just keep adding layers until the test error no longer improves.”

A method recommended by Geoff Hinton is to add layers until you start to overfit on your training set. Then you add dropout or another regularization method.
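
The "keep adding layers" rule above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not a definitive recipe: `validation_error` is a hypothetical callback that trains a network with the given number of hidden layers and returns its error on held-out data (standing in for the "test error" in the quote). The `patience` and `min_delta` parameters are also assumptions added so that one noisy result does not stop the search. The error curve in the usage example is made up.

```python
from typing import Callable, Tuple


def grow_until_no_improvement(
    validation_error: Callable[[int], float],
    max_layers: int = 20,
    patience: int = 2,
    min_delta: float = 1e-3,
) -> Tuple[int, float]:
    """Add hidden layers one at a time and stop once held-out error stalls.

    validation_error(n) is a hypothetical callback: it should train a
    network with n hidden layers and return its error on held-out data.
    """
    best_depth, best_err = 1, validation_error(1)
    stalled = 0  # consecutive depths that failed to improve the best error
    for depth in range(2, max_layers + 1):
        err = validation_error(depth)
        if err < best_err - min_delta:
            # A real improvement: remember it and reset the stall counter.
            best_depth, best_err = depth, err
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:
                break  # extra layers have stopped helping
    return best_depth, best_err


# Made-up error curve, for illustration only (not measured results).
toy_curve = {1: 0.30, 2: 0.22, 3: 0.18, 4: 0.17, 5: 0.17, 6: 0.19, 7: 0.21}
best_depth, best_err = grow_until_no_improvement(
    lambda n: toy_curve[n], max_layers=7
)
print(best_depth, best_err)
```

In practice the callback would wrap whatever training code you already have. Once the loop stops, dropout or another regularizer can be added to the chosen depth, as the paragraph above suggests.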

The best number of hidden units depends in a complex way on:

  • the numbers of input and output units
  • the number of training cases
  • the amount of noise in the targets
  • the complexity of the function or classification to be learned
  • the architecture
  • the type of activation function used in the hidden units
  • the training algorithm
  • regularization

In most situations, there is no way to determine the best number of hidden layers without training several networks and estimating the generalization error of each one. If you have too few hidden layers, you will get a high training error and a high generalization error due to underfitting and high statistical bias. If you have too many hidden layers, you may get a low training error but still have a high generalization error due to overfitting and high variance.
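
As a rough sketch of that comparison, suppose each candidate network has already been trained and its training and held-out errors recorded. The snippet below picks the depth with the lowest held-out error and labels each candidate's regime. The function names, the thresholds `acceptable_err` and `max_gap`, and the numbers in `results` are all hypothetical and chosen only for illustration; sensible thresholds depend on the task.

```python
from typing import Dict, Tuple


def pick_depth(results: Dict[int, Tuple[float, float]]) -> int:
    """Return the depth whose held-out (validation) error is lowest.

    results maps number of hidden layers -> (training_error, validation_error).
    """
    return min(results, key=lambda depth: results[depth][1])


def diagnose(
    train_err: float,
    val_err: float,
    acceptable_err: float = 0.10,  # assumed threshold, task dependent
    max_gap: float = 0.05,         # assumed threshold, task dependent
) -> str:
    """Crude heuristic label for the regime a trained network is in."""
    if train_err > acceptable_err:
        # The network cannot even fit the training data.
        return "underfitting (high bias)"
    if val_err - train_err > max_gap:
        # Fits the training data well but does not generalize.
        return "overfitting (high variance)"
    return "reasonable fit"


# Made-up errors for three candidate depths (not measured results).
results = {1: (0.25, 0.27), 3: (0.08, 0.11), 8: (0.01, 0.19)}

chosen = pick_depth(results)
labels = {depth: diagnose(tr, va) for depth, (tr, va) in results.items()}
print(chosen, labels)
```

With noisy or small datasets, a single held-out split can be misleading; averaging the held-out error over several splits gives a steadier estimate of generalization error.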