VGG

last modified : 16-07-2019

General Information (main fields described, non-exhaustive list)

  • Title: Very Deep Convolutional Networks for Large-Scale Image Recognition
  • Authors: Karen Simonyan, Andrew Zisserman
  • Link: article
  • Date of first submission: 4 September 2014
  • Implementations: Built into most deep-learning frameworks

Brief

This article studies how depth affects the accuracy of convolutional neural networks. To do so, the authors train several networks of progressively increasing depth.

How Does It Work

Several networks are trained in the article; all of the architectures are described in the figure below. Each network is a stack of convolution and pooling layers followed by fully connected layers.

Network

Results

The table below describes all the results obtained on the ILSVRC challenge.

Method                                                top-1 val. error (%)   top-5 val. error (%)   top-5 test error (%)
VGG (2 nets, multi-crop & dense eval.)                23.7                   6.8                    6.8
GoogLeNet (Szegedy et al., 2014) (1 net)              -                      7.9                    7.9
GoogLeNet (Szegedy et al., 2014) (7 nets)             -                      6.7                    6.7
MSRA (He et al., 2014) (11 nets)                      -                      -                      8.1
MSRA (He et al., 2014) (1 net)                        27.9                   9.1                    9.1
Clarifai (Russakovsky et al., 2014) (multiple nets)   -                      -                      11.7
Clarifai (Russakovsky et al., 2014) (1 net)           -                      -                      12.5
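
For reference, the top-1 and top-5 errors in this table are the percentage of images whose true label is not among the 1 or 5 highest-scoring predicted classes. A minimal plain-Python sketch of that metric follows; the function name top_k_error and the toy scores are illustrative and not taken from the article.

```python
def top_k_error(scores, labels, k=5):
    """Percentage of samples whose true label is not in the k best-scored classes.

    scores: one list of per-class scores per sample.
    labels: the true class index of each sample.
    """
    errors = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            errors += 1
    return 100.0 * errors / len(labels)


# Toy example with 3 classes and 4 samples (made-up scores).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
toy_labels = [1, 1, 0, 0]

print(top_k_error(toy_scores, toy_labels, k=1))
print(top_k_error(toy_scores, toy_labels, k=2))
```

By construction the top-5 error can never exceed the top-1 error, which is why the top-5 columns are much lower.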

The table below reports the results obtained for the different architectures described in the section "How Does It Work". The error decreases as layers are added from configuration B to configuration D, while the deepest configuration, E, performs on par with D.

ConvNet config.   smallest image side (train)   smallest image side (test)   top-1 val. error (%)   top-5 val. error (%)
B                 256                           224,256,288                  28.2                   9.6
C                 256                           224,256,288                  27.7                   9.2
C                 384                           352,384,416                  27.8                   9.2
C                 [256; 512]                    256,384,512                  26.3                   8.2
D                 256                           224,256,288                  26.6                   8.6
D                 384                           352,384,416                  26.5                   8.6
D                 [256; 512]                    256,384,512                  24.8                   7.5
E                 256                           224,256,288                  26.9                   8.7
E                 384                           352,384,416                  26.7                   8.6
E                 [256; 512]                    256,384,512                  24.8                   7.5

In Depth

Nothing special; see the article for details on the data augmentation used and the experiments carried out.