LeNet-5
Yann LeCun
- Mainly used for digit classification
- We take an image of 32 x 32 x 1.
- Back then people used average pooling.
- The output of the last conv/pool layer flattens to 400 units (5 x 5 x 16) before the fully connected layers.
- As we go deeper in the network $n_H$ and $n_W$ decrease while $n_C$ increases.
- Conv → Pool → Conv → Pool → FC → FC → Output
- They used sigmoid and tanh activations, not ReLU.
- The nonlinearity was applied after each average pooling layer.
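The shrinking of $n_H$ and $n_W$ through the layers can be traced with the standard conv output formula, $\lfloor (n + 2p - f)/s \rfloor + 1$. A minimal sketch, assuming the usual LeNet-5 layer sizes (5 x 5 convolutions with 6 then 16 filters, 2 x 2 average pooling with stride 2):

```python
def conv_out(n, f, s=1, p=0):
    """Spatial output size of a conv/pool layer: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

n = 32                       # input: 32 x 32 x 1
n = conv_out(n, f=5)         # conv1, 6 filters 5x5   -> 28 x 28 x 6
n = conv_out(n, f=2, s=2)    # avg pool 2x2, stride 2 -> 14 x 14 x 6
n = conv_out(n, f=5)         # conv2, 16 filters 5x5  -> 10 x 10 x 16
n = conv_out(n, f=2, s=2)    # avg pool 2x2, stride 2 -> 5 x 5 x 16
print(n, n * n * 16)         # 5 400 -- the 400 units fed to the FC layers
```

The same helper works for any valid/padded conv layer, which makes it easy to sanity-check an architecture on paper.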
AlexNet
Alex Krizhevsky, Ilya Sutskever, Geoffrey Hinton
- Image size of 227 x 227 x 3.
- This paper applied Max Pooling.
- The output of the last Max Pooling layer flattens to 9,216 units (6 x 6 x 256).
- Used ReLU.
- Training was split across two GPUs.
- LeNet-5 had 60k parameters, AlexNet had 60M parameters.
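The 9,216 figure can be checked the same way, assuming the commonly cited AlexNet layer sizes (11 x 11 conv with stride 4, 3 x 3 max pooling with stride 2, "same"-padded 5 x 5 and 3 x 3 convolutions):

```python
def conv_out(n, f, s=1, p=0):
    """Spatial output size of a conv/pool layer: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

n = 227                           # input: 227 x 227 x 3
n = conv_out(n, f=11, s=4)        # conv1, 96 filters 11x11, s=4 -> 55 x 55 x 96
n = conv_out(n, f=3, s=2)         # max pool 3x3, s=2            -> 27 x 27 x 96
n = conv_out(n, f=5, p=2)         # conv2, 256 filters 5x5, same -> 27 x 27 x 256
n = conv_out(n, f=3, s=2)         # max pool 3x3, s=2            -> 13 x 13 x 256
n = conv_out(n, f=3, p=1)         # conv3, 384 filters 3x3, same -> 13 x 13 x 384
n = conv_out(n, f=3, p=1)         # conv4, 384 filters 3x3, same -> 13 x 13 x 384
n = conv_out(n, f=3, p=1)         # conv5, 256 filters 3x3, same -> 13 x 13 x 256
n = conv_out(n, f=3, s=2)         # max pool 3x3, s=2            -> 6 x 6 x 256
print(n, n * n * 256)             # 6 9216 -- flattened units before FC-4096, FC-4096, softmax-1000
```

Note the same pattern as LeNet-5: $n_H$ and $n_W$ shrink (227 → 6) while $n_C$ grows (3 → 256), just at a much larger scale.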