Deep learning VGG16 network based on tensorflow 2.0

The network depth of VGG series is improved compared with its previous networks. VGG16 and VGG19 are the representatives of VGG series. This time, VGG16 network is realized based on tensorflow 2.0.

1. Introduction to vgg16 network

VGG16 network model stood out in the 2014 ImageNet competition, ranking second in classification tasks and first in positioning tasks. Compared with the previous LexNet and LeNet networks, VGG16 network reached an unprecedented level in the number of network layers at that time.

2. Network structure

3. Innovation

① Use 3x3 convolution kernel instead of 7x7 convolution kernel.

3x3 convolution kernel is the smallest receptive field size that can feel the focus of up, down, left and right. Moreover, when two 3x3 convolution kernels are superimposed, their receptive fields are equivalent to one 5x5 convolution kernel. After three convolution kernels are superimposed, their receptive fields are equivalent to one 7x7 effect.
Because the receptive fields are the same, three 3x3 convolutions use three nonlinear activation functions, which increases the nonlinear expression ability and makes the segmentation plane more separable. At the same time, small convolution kernel is used to greatly reduce the amount of parameters.
The use of 3x3 convolution kernel stack not only increases the number of network layers, but also reduces the amount of parameters.

② By increasing the number of channels to reach a deeper network, 2x2 pooled core and Max pooling method are used.

Using 2x2 pooled cores, small pooled cores can bring more detailed information capture. At that time, there was also average pooling, but the effect of max pooling was better in the image task. max was easier to capture the changes in the image, resulting in greater local information difference and better description of edge texture.

4. Network implementation

def VGG16(nb_class,input_shape):
    input_ten = Input(shape=input_shape)
    #1
    x = tf.keras.layers.Conv2D(filters=64,kernel_size=(3,3),activation='relu',padding='same')(input_ten)
    x = tf.keras.layers.Conv2D(filters=64,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.MaxPool2D(pool_size=(2,2),strides=(2,2))(x)
    #2
    x = tf.keras.layers.Conv2D(filters=128,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.Conv2D(filters=128,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.MaxPool2D(pool_size=(2,2),strides=(2,2))(x)
    #3
    x = tf.keras.layers.Conv2D(filters=256,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.Conv2D(filters=256,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.Conv2D(filters=256,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.MaxPool2D(pool_size=(2,2),strides=(2,2))(x)
    #4
    x = tf.keras.layers.Conv2D(filters=512,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.Conv2D(filters=512,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.Conv2D(filters=512,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.MaxPool2D(pool_size=(2,2),strides=(2,2))(x)
    #5
    x = tf.keras.layers.Conv2D(filters=512,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.Conv2D(filters=512,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.Conv2D(filters=512,kernel_size=(3,3),activation='relu',padding='same')(x)
    x = tf.keras.layers.MaxPool2D(pool_size=(2,2),strides=(2,2))(x)
    #FC
    x = tf.keras.layers.Flatten()(x)
    x = Dense(4096,activation='relu')(x)
    x = Dense(4096,activation='relu')(x)
    output_ten = Dense(nb_class,activation='softmax')(x)
    model = Model(input_ten,output_ten)
    return model
model_VGG16 = VGG16(24,(img_height,img_width,3))
model_VGG16.summary()
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
conv2d (Conv2D)              (None, 224, 224, 64)      1792      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 224, 224, 64)      36928     
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 112, 112, 64)      0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 112, 112, 128)     73856     
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 112, 112, 128)     147584    
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 56, 56, 128)       0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 56, 56, 256)       295168    
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 56, 56, 256)       590080    
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 56, 56, 256)       590080    
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 28, 28, 256)       0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 28, 28, 512)       1180160   
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 28, 28, 512)       2359808   
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 28, 28, 512)       2359808   
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 14, 14, 512)       0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 14, 14, 512)       2359808   
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 14, 14, 512)       2359808   
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 14, 14, 512)       2359808   
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
dense (Dense)                (None, 4096)              102764544 
_________________________________________________________________
dense_1 (Dense)              (None, 4096)              16781312  
_________________________________________________________________
dense_2 (Dense)              (None, 24)                98328     
=================================================================
Total params: 134,358,872
Trainable params: 134,358,872
Non-trainable params: 0
_________________________________________________________________

It can be found that the training parameters of VGG16 have reached 134358872, many times more than that of AlexNet, but the network performance of VGG series is still relatively good.

Come on a

Keywords: Machine Learning neural networks Deep Learning

Added by aidema on Fri, 01 Oct 2021 22:52:04 +0300