In Deep Neural Networks the depth refers to how deep the network is but in this context, the depth is used for visual recognition and it translates to the 3rd dimension of an image.
In this case you have an image, and the size of this input is 32x32x3 which is (width, height, depth)
. The neural network should be able to learn based on this parameters as depth translates to the different channels of the training images.
UPDATE:
In each layer of your CNN it learns regularities about training images. In the very first layers, the regularities are curves and edges, then when you go deeper along the layers you start learning higher levels of regularities such as colors, shapes, objects etc. This is the basic idea, but there lots of technical details. Before going any further give this a shot : http://www.datarobot.com/blog/a-primer-on-deep-learning/
UPDATE 2:
Have a look at the first figure in the link you provided. It says ‘In this example, the red input layer holds the image, so its width and height would be the dimensions of the image, and the depth would be 3 (Red, Green, Blue channels).’ It means that a ConvNet neuron transforms the input image by arranging its neurons in three dimeonsion.
As an answer to your question, depth corresponds to the different color channels of an image.
Moreover, about the filter depth. The tutorial states this.
Every filter is small spatially (along width and height), but extends through the full depth of the input volume.
Which basically means that a filter is a smaller part of an image that moves around the depth of the image in order to learn the regularities in the image.
UPDATE 3:
For the real world example I just browsed the original paper and it says this : The first convolutional layer filters the 224×224×3 input image with 96 kernels of size 11×11×3 with a stride of 4 pixels.
In the tutorial it refers the depth as the channel, but in real world you can design whatever dimension you like. After all that is your design
The tutorial aims to give you a glimpse of how ConvNets work in theory, but if I design a ConvNet nobody can stop me proposing one with a different depth.
Does this make any sense?