ResNet in practice: image classification with ResNet50 in TensorFlow 2.x (small dataset)

Abstract

This example uses a subset of the Plant Seedlings dataset, which has 12 categories. In this article we implement an image classification task in TensorFlow 2.x, using ResNet50 as the classification model.

Through this article, you can learn:

1. How to load picture data and process data.

2. How to convert labels to one-hot encoding.

3. How to use data augmentation.

4. How to use mixup.

5. How to segment data sets.

6. How to load the pretrained model.

Training

1. Mixup

mixup is an unconventional, data-independent data augmentation method based on a simple principle: it constructs new training samples and labels by linear interpolation. Samples and labels are mixed with the same weight, as shown in the following formula. The rule is very simple, yet it is an unusual augmentation strategy.

$$\tilde{x} = \lambda x_{i} + (1 - \lambda) x_{j}, \qquad \tilde{y} = \lambda y_{i} + (1 - \lambda) y_{j}$$

Here $\left( x_{i}, y_{i} \right)$ and $\left( x_{j}, y_{j} \right)$ are two training pairs (a training sample and its label) drawn from the original dataset, and $\lambda$ is a parameter that follows a Beta distribution, $\lambda \sim Beta\left( \alpha, \alpha \right)$. The probability density function of the Beta distribution is shown in the figure below, where $\alpha \in \left( 0, +\infty \right)$.

$\alpha$ is therefore a hyperparameter. As $\alpha$ increases, the training error of the network increases and its generalization ability improves. When $\alpha \rightarrow 0$, $\lambda$ concentrates at 0 or 1 and the model degenerates into the original training strategy without mixing. Reference: https://www.jianshu.com/p/d22fcd86f36d
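The interpolation can be sketched in a few lines of NumPy. This is a minimal illustration rather than the generator used later in the article; the helper name `mixup_pair` and the toy arrays are assumptions made for the example.

```python
import numpy as np


def mixup_pair(x_i, y_i, x_j, y_j, alpha=0.2, rng=None):
    """Blend two (sample, one-hot label) pairs with lambda ~ Beta(alpha, alpha)."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    x = lam * x_i + (1.0 - lam) * x_j   # mix the inputs
    y = lam * y_i + (1.0 - lam) * y_j   # mix the labels with the same weight
    return x, y, lam


rng = np.random.default_rng(0)
x_i, x_j = np.zeros((2, 2, 3)), np.ones((2, 2, 3))      # two toy "images"
y_i, y_j = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # their one-hot labels

x, y, lam = mixup_pair(x_i, y_i, x_j, y_j, alpha=0.2, rng=rng)
print(lam, y)   # the mixed label is [lam, 1 - lam]
```

With a small alpha such as 0.2, most draws of lambda fall close to 0 or 1, so most mixed samples stay close to one of the two originals.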

Create a new file named mixup_generator.py and insert the following code:

import numpy as np


class MixupGenerator():
    def __init__(self, X_train, y_train, batch_size=32, alpha=0.2, shuffle=True, datagen=None):
        self.X_train = X_train
        self.y_train = y_train
        self.batch_size = batch_size
        self.alpha = alpha
        self.shuffle = shuffle
        self.sample_num = len(X_train)
        self.datagen = datagen

    def __call__(self):
        while True:
            indexes = self.__get_exploration_order()
            itr_num = int(len(indexes) // (self.batch_size * 2))

            for i in range(itr_num):
                batch_ids = indexes[i * self.batch_size * 2:(i + 1) * self.batch_size * 2]
                X, y = self.__data_generation(batch_ids)

                yield X, y

    def __get_exploration_order(self):
        indexes = np.arange(self.sample_num)

        if self.shuffle:
            np.random.shuffle(indexes)

        return indexes

    def __data_generation(self, batch_ids):
        _, h, w, c = self.X_train.shape
        l = np.random.beta(self.alpha, self.alpha, self.batch_size)
        X_l = l.reshape(self.batch_size, 1, 1, 1)
        y_l = l.reshape(self.batch_size, 1)

        X1 = self.X_train[batch_ids[:self.batch_size]]
        X2 = self.X_train[batch_ids[self.batch_size:]]
        X = X1 * X_l + X2 * (1 - X_l)

        if self.datagen:
            for i in range(self.batch_size):
                X[i] = self.datagen.random_transform(X[i])
                X[i] = self.datagen.standardize(X[i])

        if isinstance(self.y_train, list):
            y = []

            for y_train_ in self.y_train:
                y1 = y_train_[batch_ids[:self.batch_size]]
                y2 = y_train_[batch_ids[self.batch_size:]]
                y.append(y1 * y_l + y2 * (1 - y_l))
        else:
            y1 = self.y_train[batch_ids[:self.batch_size]]
            y2 = self.y_train[batch_ids[self.batch_size:]]
            y = y1 * y_l + y2 * (1 - y_l)

        return X, y

2. Import the required packages and set global parameters

import os

import cv2
import numpy as np
from sklearn.model_selection import train_test_split
# Use the public tensorflow.keras API throughout; mixing it with the private
# tensorflow.python.keras modules can cause incompatibilities.
from tensorflow.keras import utils as np_utils  # provides to_categorical
from tensorflow.keras.applications.resnet import ResNet50
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import img_to_array

from mixup_generator import MixupGenerator

norm_size = 224
datapath = 'data/train'
EPOCHS = 20
INIT_LR = 1e-3
labelList = []
dicClass = {'Black-grass': 0, 'Charlock': 1, 'Cleavers': 2, 'Common Chickweed': 3, 'Common wheat': 4, 'Fat Hen': 5, 'Loose Silky-bent': 6,
            'Maize': 7, 'Scentless Mayweed': 8, 'Shepherds Purse': 9, 'Small-flowered Cranesbill': 10, 'Sugar beet': 11}
classnum = 12
batch_size = 4

As you can see, TensorFlow 2.0 and later integrate Keras, so Keras does not need to be installed separately. To upgrade older code to TensorFlow 2.0 or later, add tensorflow in front of keras in the imports.

With the imports done, let's go through the important global parameters:

  • norm_size = 224: the default input size of ResNet50 is 224 × 224.

  • datapath = 'data/train': the path where the pictures are stored. If there are many pictures, do not put them inside the project directory; otherwise PyCharm will index all of them when loading the project, which is very slow.

  • EPOCHS = 20: the number of epochs. Choosing a suitable value is tricky. Generally, 300 is enough. If the model still seems under-trained, load the saved model and continue training.

  • INIT_LR = 1e-3: the initial learning rate. Training generally starts at 0.001 and the rate decreases gradually, but it should not become too small; 1e-6 is a reasonable lower bound.

  • classnum = 12: the number of categories. The dataset has 12 categories, so it is set to 12.

  • batch_size = 4: the batch size. Choose it according to the hardware and the size of the dataset. If it is too small, the loss fluctuates strongly and convergence is poor. By experience it is usually set to a power of 2. On Windows, GPU memory usage can be checked in Task Manager.

    On Ubuntu, GPU memory usage can be checked with nvidia-smi.

3. Load the pictures

Processing an image takes the following steps:

  1. Read the image.
  2. Resize the image to the specified size.
  3. Convert the image to an array.
  4. Normalize the image.
  5. One-hot encode the label.

See the code for details:

def loadImageData():
    imageList = []
    listClasses = os.listdir(datapath)# Category folder
    print(listClasses)
    for class_name in listClasses:
        label_id = dicClass[class_name]
        class_path=os.path.join(datapath,class_name)
        image_names=os.listdir(class_path)
        for image_name in image_names:
            image_full_path = os.path.join(class_path, image_name)
            labelList.append(label_id)
            image = cv2.imdecode(np.fromfile(image_full_path, dtype=np.uint8), -1)
            image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
            if image.shape[2] >3:
                image=image[:,:,:3]
                print(image.shape)
            image = img_to_array(image)
            imageList.append(image)
    imageList = np.array(imageList) / 255.0
    return imageList


print("Start loading data")
imageArr = loadImageData()
print(type(imageArr))
labelList = np.array(labelList)
print("Loading data complete")
print(labelList)
labelList = np_utils.to_categorical(labelList, classnum)
print(labelList)
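To make the one-hot step concrete, the sketch below reproduces the effect of to_categorical with plain NumPy. The toy label array is an assumption; in the article the labels come from loadImageData().

```python
import numpy as np

classnum = 12
labels = np.array([0, 4, 11, 4])   # hypothetical class ids

# Row k of the identity matrix is the one-hot vector for class k.
one_hot = np.eye(classnum, dtype="float32")[labels]
print(one_hot.shape)   # (4, 12)

# argmax inverts the encoding and recovers the class ids.
recovered = one_hot.argmax(axis=1)
print(recovered)
```

Each row contains a single 1 at the position of its class, which is the format expected by the categorical_crossentropy loss and by the label mixing in mixup.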

After preparing the data, we split it into a training set and a test set (used here as the validation set), generally in a ratio of 4:1 or 7:3. The split uses the train_test_split() method, which is imported with from sklearn.model_selection import train_test_split. Example:

trainX, valX, trainY, valY = train_test_split(imageArr, labelList, test_size=0.2, random_state=42)
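On a small dataset, a purely random split can leave some classes under-represented in the validation set. One option, which the original code does not use, is the stratify argument of train_test_split. The sketch below assumes synthetic data with 10 placeholder images per class; because the labels are already one-hot, the class ids are recovered with argmax first.

```python
import numpy as np
from sklearn.model_selection import train_test_split

classnum = 12
class_ids = np.repeat(np.arange(classnum), 10)                  # synthetic: 10 samples per class
images = np.zeros((len(class_ids), 8, 8, 3), dtype="float32")   # placeholder images
one_hot = np.eye(classnum, dtype="float32")[class_ids]

trainX, valX, trainY, valY = train_test_split(
    images, one_hot, test_size=0.2, random_state=42,
    stratify=one_hot.argmax(axis=1),   # keep the class proportions in both splits
)

print(trainX.shape[0], valX.shape[0])      # 96 24
print(np.bincount(valY.argmax(axis=1)))    # validation samples per class
```

Whether stratification helps depends on how unbalanced the real class folders are, so treat it as something to try rather than a required change.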

4. Image augmentation

ImageDataGenerator() is the image generator in the keras.preprocessing.image module. It augments data in batches, which effectively enlarges the dataset and improves the generalization ability of the model. Supported operations include rotation, deformation, normalization and so on.

keras.preprocessing.image.ImageDataGenerator(
    featurewise_center=False, samplewise_center=False,
    featurewise_std_normalization=False, samplewise_std_normalization=False,
    zca_whitening=False, zca_epsilon=1e-06, rotation_range=0.0,
    width_shift_range=0.0, height_shift_range=0.0, brightness_range=None,
    shear_range=0.0, zoom_range=0.0, channel_shift_range=0.0,
    fill_mode='nearest', cval=0.0, horizontal_flip=False, vertical_flip=False,
    rescale=None, preprocessing_function=None, data_format=None,
    validation_split=0.0)

Parameters:

  • featurewise_center: Boolean. Subtract the dataset-wide mean of each channel from the corresponding channel of every input picture.
  • samplewise_center: Boolean. Subtract each picture's own mean from it so that every sample has mean 0.
  • featurewise_std_normalization: Boolean. Divide the inputs by the standard deviation of the dataset.
  • samplewise_std_normalization: Boolean. Divide each input by its own standard deviation.
  • zca_epsilon: epsilon for ZCA whitening. Default 1e-6.
  • zca_whitening: Boolean. Apply ZCA whitening to decorrelate the inputs.
  • rotation_range: range of random rotations, in degrees
  • width_shift_range: horizontal translation range
  • height_shift_range: vertical translation range
  • shear_range: float, shear intensity (shear angle in degrees)
  • zoom_range: range of random zoom
  • fill_mode: how points outside the input boundaries are filled: 'constant', 'nearest', 'reflect' or 'wrap'
  • cval: the fill value used when fill_mode == 'constant'
  • horizontal_flip: Boolean. Randomly flip inputs horizontally.
  • vertical_flip: Boolean. Randomly flip inputs vertically.
  • preprocessing_function: a processing function provided by the user, applied to each input
  • data_format: channels_first or channels_last
  • validation_split: fraction of the data reserved for validation (between 0 and 1)
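The difference between the featurewise and samplewise options is easy to see on a toy batch. The NumPy sketch below imitates the two centering modes by hand, assuming the channels_last layout; it illustrates the idea and is not the Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.random((5, 4, 4, 3))   # toy batch: 5 images, 4x4 pixels, 3 channels

# Featurewise centering: one mean per channel, computed over the whole dataset.
channel_mean = batch.mean(axis=(0, 1, 2), keepdims=True)
featurewise = batch - channel_mean

# Samplewise centering: one mean per image, computed from that image alone.
image_mean = batch.mean(axis=(1, 2, 3), keepdims=True)
samplewise = batch - image_mean

print(featurewise.mean(axis=(0, 1, 2)))   # approximately 0 for every channel
print(samplewise.mean(axis=(1, 2, 3)))    # approximately 0 for every image
```

Because the featurewise statistics depend on the whole dataset, ImageDataGenerator needs a call to its fit() method on the training data before those options take effect; the samplewise options need no such step.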

The image enhancement code used in this example is as follows:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
                                   rotation_range=20,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   horizontal_flip=True)
val_datagen = ImageDataGenerator()  # The validation set is not augmented
training_generator_mix = MixupGenerator(trainX, trainY, batch_size=batch_size, alpha=0.2, datagen=train_datagen)()
val_generator = val_datagen.flow(valX, valY, batch_size=batch_size, shuffle=True)

5. Keep the best model and dynamically set the learning rate

ModelCheckpoint: used to save the best model.

The syntax is as follows:

keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1)

This callback saves the model to filepath after each epoch.

filepath can be a format string; its placeholders are filled with the epoch number and the keys of the logs dictionary passed to on_epoch_end.

For example, if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5, one file is generated per checkpoint, with the epoch number and the validation loss in its name.
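The placeholder mechanism is ordinary Python string formatting, so it can be tried without Keras. In the sketch below the epoch number and the log values are hypothetical.

```python
template = "weights.{epoch:02d}-{val_loss:.2f}.hdf5"

# Hypothetical values for one epoch. Unused keys such as val_accuracy are
# simply ignored by str.format.
logs = {"val_loss": 0.4567, "val_accuracy": 0.91}
filename = template.format(epoch=3, **logs)
print(filename)   # weights.03-0.46.hdf5
```

Any metric used in the template must actually be present in the logs, so a placeholder such as val_loss only works when validation data is passed to fit.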

Parameters

  • filepath: string, the path to save the model to
  • monitor: the quantity to monitor
  • verbose: verbosity mode, 0 or 1
  • save_best_only: when set to True, only the model that performs best on the monitored quantity is saved
  • mode: one of 'auto', 'min' and 'max'. When save_best_only=True, it decides how the best model is chosen: if the monitored value is val_acc the mode should be max, and if it is val_loss the mode should be min. In auto mode the direction is inferred from the name of the monitored quantity.
  • save_weights_only: if True, only the model weights are saved; otherwise the whole model (including structure and configuration) is saved
  • period: the number of epochs between checkpoints (deprecated in TensorFlow 2.x in favor of save_freq)

ReduceLROnPlateau: reduces the learning rate when the monitored metric stops improving. The syntax is as follows:

keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto', min_delta=0.0001, cooldown=0, min_lr=0)

When learning stagnates, reducing the learning rate by a factor of 2 to 10 often gives better results. This callback watches the monitored quantity, and if it has not improved for patience epochs, the learning rate is reduced.

Parameters

  • monitor: the quantity to monitor
  • factor: the factor by which the learning rate is reduced each time, lr = lr * factor
  • patience: the number of epochs without improvement after which the learning rate is reduced
  • mode: one of 'auto', 'min' and 'max'. In min mode, the learning rate is reduced when the monitored value stops decreasing. In max mode, it is reduced when the monitored value stops increasing.
  • min_delta: threshold for deciding whether a change counts as an improvement, i.e. whether the monitored value has reached a plateau (called epsilon in older Keras versions)
  • cooldown: the number of epochs to wait after a reduction before normal operation resumes
  • min_lr: lower limit of the learning rate
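The plateau rule can be simulated without training anything. The function below is a simplified sketch of the logic for mode='max' with cooldown=0, using an improvement threshold that plays the role of min_delta (epsilon in older Keras versions). The function name and the accuracy values are made up for illustration.

```python
def simulate_reduce_lr(values, lr=1e-3, factor=0.5, patience=2,
                       min_lr=1e-6, min_delta=1e-4):
    """Return the learning rate in effect after each epoch ('max' mode, no cooldown)."""
    best = float("-inf")
    wait = 0
    schedule = []
    for value in values:
        if value > best + min_delta:      # counts as an improvement
            best = value
            wait = 0
        else:                             # no improvement this epoch
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule


val_accuracy = [0.50, 0.60, 0.60, 0.60, 0.61, 0.61, 0.61]   # hypothetical metric values
schedule = simulate_reduce_lr(val_accuracy)
print(schedule)
```

With patience=2 the rate is halved after the second stagnant epoch in a row, once at the fourth epoch and again at the seventh, and it never drops below min_lr.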

The code of this example is as follows:

checkpointer = ModelCheckpoint(filepath='weights_best_Deset_model.hdf5',
                               monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')

reduce = ReduceLROnPlateau(monitor='val_accuracy', patience=10,
                           verbose=1,
                           factor=0.5,
                           min_lr=1e-6)

6. Modeling and training

model = Sequential()
model.add(ResNet50(include_top=False, pooling='avg', weights='imagenet'))
model.add(Dense(classnum, activation='softmax'))
model.summary()
optimizer = Adam(learning_rate=INIT_LR)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(training_generator_mix,
                    steps_per_epoch=trainX.shape[0] // batch_size,  # step counts must be integers
                    validation_data=val_generator,
                    epochs=EPOCHS,
                    validation_steps=valX.shape[0] // batch_size,
                    callbacks=[checkpointer, reduce])
model.save('my_model.h5')
print(history)
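It is worth checking how steps_per_epoch relates to the mixup generator. Each batch it yields consumes 2 * batch_size shuffled indices, so one pass over the index list produces fewer batches than the number of training images divided by batch_size. The arithmetic below assumes a hypothetical training set of 1,000 images and uses integer division for the step count.

```python
n_train = 1000    # hypothetical number of training images
batch_size = 4

batches_per_pass = n_train // (batch_size * 2)   # itr_num in MixupGenerator.__call__
steps_per_epoch = n_train // batch_size          # integer step count for model.fit

print(batches_per_pass)                      # 125
print(steps_per_epoch)                       # 250
print(steps_per_epoch / batches_per_pass)    # 2.0
```

One Keras epoch therefore draws about two freshly shuffled passes from the generator. This is harmless because the generator loops forever, but it means an "epoch" here is a bookkeeping unit rather than exactly one pass over the data.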

7. Save the training results and plot the curves

loss_trend_graph_path = r"WW_loss.jpg"
acc_trend_graph_path = r"WW_acc.jpg"
import matplotlib.pyplot as plt

print("Now,we start drawing the loss and acc trends graph...")
# summarize history for accuracy
fig = plt.figure(1)
plt.plot(history.history["accuracy"])
plt.plot(history.history["val_accuracy"])
plt.title("Model accuracy")
plt.ylabel("accuracy")
plt.xlabel("epoch")
plt.legend(["train", "test"], loc="upper left")
plt.savefig(acc_trend_graph_path)
plt.close(1)
# summarize history for loss
fig = plt.figure(2)
plt.plot(history.history["loss"])
plt.plot(history.history["val_loss"])
plt.title("Model loss")
plt.ylabel("loss")
plt.xlabel("epoch")
plt.legend(["train", "test"], loc="upper left")
plt.savefig(loss_trend_graph_path)
plt.close(2)
print("We are done, everything seems OK...")
# On Windows, uncomment the next line to shut down the machine after 10 seconds
#os.system("shutdown -s -t 10")

result:

Test part

Single picture prediction

1. Import dependencies

import os
import time

import cv2
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import img_to_array

2. Set global parameters

Note that the label dictionary must use the same class order as in training.

norm_size=224
imagelist=[]
emotion_labels = {
    0: 'Black-grass',
    1: 'Charlock',
    2: 'Cleavers',
    3: 'Common Chickweed',
    4: 'Common wheat',
    5: 'Fat Hen',
    6: 'Loose Silky-bent',
    7: 'Maize',
    8: 'Scentless Mayweed',
    9: 'Shepherds Purse',
    10: 'Small-flowered Cranesbill',
    11: 'Sugar beet',
}

3. Loading model

emotion_classifier = load_model("weights_best_Deset_model.hdf5")  # best checkpoint saved during training
t1=time.time()

4. Processing pictures

The logic of processing a picture is the same as for the training set. The steps are as follows:

  • Read the picture.
  • Resize the picture to norm_size × norm_size.
  • Convert the picture to an array.
  • Put it in imagelist.
  • Divide the whole imagelist by 255 to scale the values to between 0 and 1.

image = cv2.imdecode(np.fromfile('data/test/0a64e3e6c.png', dtype=np.uint8), -1)
# load the image, pre-process it, and store it in the data list
image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
if image.shape[2] > 3:
    image = image[:, :, :3]  # drop the alpha channel, as in training
image = img_to_array(image)
imagelist.append(image)
imageList = np.array(imagelist, dtype="float") / 255.0

5. Predict the category

Predict the category and take the index with the highest probability.

out=emotion_classifier.predict(imageList)
print(out)
pre=np.argmax(out)
emotion = emotion_labels[pre]
t2=time.time()
print(emotion)
t3=t2-t1
print(t3)

Output:

[[1.7556800e-03 8.5450716e-07 1.9150861e-05 1.9705877e-07 9.9732012e-01
8.0649025e-04 2.5912817e-07 2.2540871e-06 8.6973196e-05 6.1359890e-07
4.1976641e-08 7.3218480e-06]]
Common wheat
3.50178861618042
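The printed vector holds one softmax probability per class, so besides the argmax it can be useful to look at the runners-up. The sketch below reuses the output shown above; listing the three most likely classes is an addition for illustration, not part of the original script.

```python
import numpy as np

labels = ['Black-grass', 'Charlock', 'Cleavers', 'Common Chickweed', 'Common wheat',
          'Fat Hen', 'Loose Silky-bent', 'Maize', 'Scentless Mayweed',
          'Shepherds Purse', 'Small-flowered Cranesbill', 'Sugar beet']

out = np.array([[1.7556800e-03, 8.5450716e-07, 1.9150861e-05, 1.9705877e-07, 9.9732012e-01,
                 8.0649025e-04, 2.5912817e-07, 2.2540871e-06, 8.6973196e-05, 6.1359890e-07,
                 4.1976641e-08, 7.3218480e-06]])

probs = out[0]
top3 = np.argsort(probs)[::-1][:3]   # indices of the three largest probabilities
for idx in top3:
    print(f"{labels[idx]}: {probs[idx]:.4f}")
```

Here the model is very confident: Common wheat receives about 99.7% of the probability mass, followed by Black-grass and Fat Hen with well under 1% each.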

Batch prediction

Batch prediction differs from single-picture prediction mainly in how the data is read and how the predicted categories are handled afterwards. Nothing else changes.

Steps:

  • Load the model.
  • Define the directory of the test set.
  • Get the pictures in the directory.
  • Loop over the pictures:
    • Read the picture.
    • Resize the picture.
    • Convert it to an array.
    • Put it in imagelist.
  • Scale the values to between 0 and 1.
  • Predict.

emotion_classifier = load_model("weights_best_Deset_model.hdf5")  # best checkpoint saved during training
t1 = time.time()
predict_dir = 'data/test'
imagelist = []  # start from an empty list for the batch
test11 = os.listdir(predict_dir)
for file in test11:
    filepath = os.path.join(predict_dir, file)

    image = cv2.imdecode(np.fromfile(filepath, dtype=np.uint8), -1)
    # load the image, pre-process it, and store it in the data list
    image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
    if image.shape[2] > 3:
        image = image[:, :, :3]  # drop the alpha channel, as in training
    image = img_to_array(image)
    imagelist.append(image)
imageList = np.array(imagelist, dtype="float") / 255.0
out = emotion_classifier.predict(imageList)
print(out)
pre = [np.argmax(i) for i in out]
print([emotion_labels[i] for i in pre])  # class name for each test picture
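To turn the batch output into something readable, each row's argmax can be paired with its file name. The sketch below uses made-up probabilities, a shortened label dictionary and hypothetical file names, so only the mapping step carries over to the real script.

```python
import numpy as np

emotion_labels = {0: 'Black-grass', 1: 'Charlock', 2: 'Cleavers'}   # shortened for the example
file_names = ['a.png', 'b.png', 'c.png']                             # hypothetical test files
out = np.array([[0.1, 0.7, 0.2],
                [0.8, 0.1, 0.1],
                [0.2, 0.2, 0.6]])                                    # made-up softmax outputs

pre = np.argmax(out, axis=1)   # one class index per row, no Python loop needed
results = {name: emotion_labels[int(i)] for name, i in zip(file_names, pre)}
print(results)
```

In the real script the file names would come from os.listdir(predict_dir), in the same order in which the pictures were appended to imagelist.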

Operation results:

Keywords: Machine Learning Deep Learning

Added by The_Walrus on Sat, 22 Jan 2022 19:13:57 +0200