A concise end-to-end implementation of cat image classification based on PaddleX

  • PaddlePaddle full-process development tool - PaddleX
  • PaddleX, the full-process development tool of PaddlePaddle, integrates all the capabilities required for deep learning development, such as the PaddlePaddle core framework, model zoo, tools, and components, and covers the entire deep learning development workflow.
  • PaddleX also provides a concise Python API and a graphical development client that can be downloaded and installed with one click. Users can choose the development mode that fits their actual production needs to get the best full-process development experience with PaddlePaddle.

Data description

This competition asks contestants to classify 12 kinds of cats, a classic image classification task in the CV field. As the cornerstone of other image tasks, image classification is a fast way to get started with computer vision. The competition dataset contains pictures of 12 kinds of cats and is divided into a training set and a test set.

  1. Training set: high-definition color pictures are provided together with their class labels; 2160 cat pictures in total, including an annotation file (see the format sketch below).
  2. Test set: only color pictures are provided, 240 cat pictures in total, without label files.
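  • A hypothetical illustration of the annotation format in train_list.txt, inferred from the parsing code in section 2.2 (tab-separated relative image path and class index 0-11; the file names shown are placeholders, not real entries):

cat_12_train/<32-character-random-name>.jpg    0
cat_12_train/<32-character-random-name>.jpg    11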

1 Import dependencies

!pip install paddlex
import warnings
warnings.filterwarnings('ignore')

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

import paddle
import paddlex as pdx
from paddlex import transforms as T

import numpy as np
import pandas as pd
import shutil
import glob

import cv2
import imghdr
from PIL import Image

2 Data cleaning

  • Generate an ImageNet-format dataset folder; the target directory structure is as follows:
Dataset/ # Image classification dataset root directory
|--class A/  # All pictures in the current folder belong to category A
|  |--a_1.jpg
|  |--a_2.jpg
|  |--...
|  |--...
|
|--...
|
|--class Z/ # All pictures in the current folder belong to category Z
|  |--z1.jpg
|  |--z2.jpg
|  |--...
|  |--...
  • Because the training set provides a train_list.txt that stores the corresponding category information, we first put all the images into the corresponding category folders, and then let PaddleX split the data automatically.

2.1 Decompress the datasets

pdx.utils.decompress('data/data10954/cat_12_train.zip')
pdx.utils.decompress('data/data10954/cat_12_test.zip')

Generate 12 category folders.

for i in range(12):
    cls_path = os.path.join('data/data10954/ImageNetDataset/', '%02d' % int(i))
    if not os.path.exists(cls_path):
        os.makedirs(cls_path)
  • Why use 00 / 01 / ... as category names? Because PaddleX sorts categories by string when splitting the dataset, numeric names such as 2 / 3 / ... would end up after 10 / 11 once the data is split (see the sketch below).
  • We want the category numbers output by the model to be consistent with our folder names (i.e. the numbers submitted to the competition), so they are formatted as two digits.
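  • A quick illustration of the string-sorting behavior (a minimal sketch, not part of the pipeline):

sorted(['2', '3', '10', '11'])    # ['10', '11', '2', '3']  - numeric order is lost
sorted(['02', '03', '10', '11'])  # ['02', '03', '10', '11'] - zero-padding preserves the order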

2.2 Clean abnormal formats

  • Build a one-to-one mapping between file names and categories, then move the images into the target folders according to their category cls: data/data10954/ImageNetDataset/*/*.jpg.
train_df = pd.read_csv('data/data10954/train_list.txt', header=None, sep='\t')
train_df.columns = ['name', 'cls']
train_df['name'] = train_df['name'].apply(lambda x: str(x).strip().split('/')[-1])
train_df['cls'] = train_df['cls'].apply(lambda x: '%02d' % int(str(x).strip()))
train_df.head()
   name                                  cls
0  8GOkTtqw7E6IHZx4olYnhzvXLCiRsUfM.jpg  00
1  hwQDH3VBabeFXISfjlWEmYicoyr6qK1p.jpg  00
2  RDgZKvM6sp3Tx9dlqiLNEVJjmcfQ0zI4.jpg  00
3  ArBRzHyphTxFS2be9XLaU58m34PudlEf.jpg  00
4  kmW7GTX6uyM2A53NBZxibYRpQnIVatCH.jpg  00
  • The model input should be a three-channel RGB image; if imghdr.what() cannot recognize the image format, the file is deleted.
for i in range(len(train_df)):
    img_path = os.path.join('data/data10954/cat_12_train', train_df.at[i, 'name'])

    if os.path.exists(img_path) and imghdr.what(img_path):
        img = Image.open(img_path)
        if img.mode != 'RGB':
            img = img.convert('RGB')
            img.save(img_path)
    else:
        os.remove(img_path)
        print('delete:', img_path)
delete: data/data10954/cat_12_train/ieOvwupZbC4Xckj73znWxo0ARMKD5FrP.jpg
delete: data/data10954/cat_12_train/ovY2atRg8fsZ4jTbKC0UJIOd7mlPEy9u.jpg
  • Move each image from the source path src_path to the destination path dst_path.
  • Note: the previous step only deleted the abnormal images without updating the DataFrame, so try/except is used to ignore records that still exist in the DataFrame but whose image files no longer exist.
for i in range(len(train_df)):
    src_path = os.path.join(
        'data/data10954/cat_12_train',
        train_df.at[i, 'name'])

    dst_path = os.path.join(
        os.path.join(
            'data/data10954/ImageNetDataset/',
            train_df.at[i, 'cls']),
        train_df.at[i, 'name'])

    try:
        shutil.move(src_path, dst_path)
    except Exception as e:
        print(e)
[Errno 2] No such file or directory: 'data/data10954/cat_12_train/ieOvwupZbC4Xckj73znWxo0ARMKD5FrP.jpg'
[Errno 2] No such file or directory: 'data/data10954/cat_12_train/ovY2atRg8fsZ4jTbKC0UJIOd7mlPEy9u.jpg'

3 Dataset splitting

!paddlex --split_dataset --format ImageNet\
    --dataset_dir data/data10954/ImageNetDataset\
    --val_value 0.15\
    --test_value 0

4 Data transforms and loading

train_transforms = T.Compose([
    T.MixupImage(mixup_epoch=115),
    T.ResizeByShort(short_size=256),
    T.RandomCrop(crop_size=224, aspect_ratio=[0.75, 1.25], scaling=[0.3, 1.0]),
    T.RandomHorizontalFlip(0.5),
    T.RandomDistort(
        brightness_range=0.4, brightness_prob=0.5,
        contrast_range=0.4, contrast_prob=0.5,
        saturation_range=0.4, saturation_prob=0.5,
        hue_range=18, hue_prob=0.5),
    T.RandomBlur(0.05),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

eval_transforms = T.Compose([
    T.ResizeByShort(short_size=256),
    T.CenterCrop(crop_size=224),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
train_dataset = pdx.datasets.ImageNet(
    data_dir='data/data10954/ImageNetDataset',
    file_list='data/data10954/ImageNetDataset/train_list.txt',
    label_list='data/data10954/ImageNetDataset/labels.txt',
    transforms=train_transforms,
    shuffle=True)

eval_dataset = pdx.datasets.ImageNet(
    data_dir='data/data10954/ImageNetDataset',
    file_list='data/data10954/ImageNetDataset/val_list.txt',
    label_list='data/data10954/ImageNetDataset/labels.txt',
    transforms=eval_transforms)

5 Model configuration and training

model = pdx.cls.ResNet101_vd_ssld(num_classes=len(train_dataset.labels))

5.1 Learning rate strategy

The learning rate strategy and parameters follow the classic ImageNet training recipe:

  1. LinearWarmup: warm up the learning rate for 5 epochs (warmup_steps = step_each_epoch * 5);
  2. PiecewiseDecay: step-wise learning rate decay, where the learning rate drops to 0.1x of its previous value every 30 epochs, for 120 epochs in total (learning_rate * (0.1**i));
  3. The Momentum optimizer is used, with an L2 weight regularization coefficient of 0.0005;
  4. The batch size and initial learning rate scale linearly from 256 / 0.1 down to 64 / 0.025; considering the small dataset and the use of pretrained weights, the learning rate is further reduced to 0.0125 at batch size 64 (see the sketch below).
  • In the notebook you can type ?paddle.optimizer.lr.PiecewiseDecay to see how to use this class, and likewise ?paddle.optimizer.lr.CosineAnnealingDecay for cosine decay.
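  • The initial learning rate from item 4 can be reproduced with the linear scaling rule; a minimal sketch of the arithmetic only (the training code below simply hard-codes the result):

base_batch_size, base_lr = 256, 0.1
train_batch_size = 64
scaled_lr = base_lr * train_batch_size / base_batch_size  # 0.025 by linear scaling
scaled_lr = scaled_lr / 2                                  # reduced again for the small, pretrained setup
print(scaled_lr)  # 0.0125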
num_epochs = 120
learning_rate = 0.0125
lr_decay_epochs = [30, 60, 90]
train_batch_size = 64
step_each_epoch = train_dataset.num_samples // train_batch_size

boundaries = [b * step_each_epoch for b in lr_decay_epochs]
values = [learning_rate * (0.1**i) for i in range(len(lr_decay_epochs) + 1)]
lr = paddle.optimizer.lr.PiecewiseDecay(
    boundaries=boundaries,
    values=values)

lr = paddle.optimizer.lr.LinearWarmup(
    learning_rate=lr,
    warmup_steps=step_each_epoch * 5,
    start_lr=0.0,
    end_lr=learning_rate)

optimizer = paddle.optimizer.Momentum(
    learning_rate=lr,
    momentum=0.9,
    weight_decay=paddle.regularizer.L2Decay(0.0005),
    parameters=model.net.parameters())

5.2 Start training

  • You can run watch -n 0 nvidia-smi in a terminal to monitor GPU memory usage, and adjust the image transform size, batch size, and other parameters as needed.
  • GPU memory usage for this configuration: 9354MiB / 16384MiB.
  • Because the validation split is small and Mixup data augmentation is used, a model with high metrics on the validation set early in training may be less robust than one saved near the end of training.
model.train(
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,

    num_epochs=num_epochs,
    train_batch_size=train_batch_size,
    optimizer=optimizer,

    save_interval_epochs=1,
    log_interval_steps=step_each_epoch * 5,

    pretrain_weights='IMAGENET',
    save_dir='output/ResNet101_vd_ssld',
    use_vdl=True)

6 Evaluation and prediction

6.1 Model evaluation

model = pdx.load_model('output/ResNet101_vd_ssld/best_model')

Different evaluation transforms also lead to different results; for example, the original transform eval_transforms_origin and the modified transform eval_transforms_modify below give slightly different metrics.

eval_transforms_origin = T.Compose([
    T.ResizeByShort(short_size=256),
    T.CenterCrop(crop_size=224),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

eval_transforms_modify = T.Compose([
    T.Resize(target_size=224),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
eval_dataset = pdx.datasets.ImageNet(
    data_dir='data/data10954/ImageNetDataset',
    file_list='data/data10954/ImageNetDataset/val_list.txt',
    label_list='data/data10954/ImageNetDataset/labels.txt',
    transforms=eval_transforms_origin)

model.evaluate(eval_dataset=eval_dataset, batch_size=64)
OrderedDict([('acc1', 0.9713542), ('acc5', 1.0)])
eval_dataset = pdx.datasets.ImageNet(
    data_dir='data/data10954/ImageNetDataset',
    file_list='data/data10954/ImageNetDataset/val_list.txt',
    label_list='data/data10954/ImageNetDataset/labels.txt',
    transforms=eval_transforms_modify)

model.evaluate(eval_dataset=eval_dataset, batch_size=64)
OrderedDict([('acc1', 0.9583333), ('acc5', 1.0)])
  • Note that if eval_dataset is not specified during training, the model saved during training will not carry built-in evaluation transforms. In that case the transforms must be passed explicitly as model.predict(image, transforms=eval_transforms) (see the sketch after the output below).
# We set eval_dataset during training, so the saved model carries the evaluation transforms
model.get_model_info()['Transforms']
[{'ResizeByShort': {'short_size': 256, 'max_size': -1, 'interp': 'LINEAR'}},
 {'CenterCrop': {'crop_size': 224}},
 {'Normalize': {'mean': [0.485, 0.456, 0.406],
   'std': [0.229, 0.224, 0.225],
   'min_val': [0, 0, 0],
   'max_val': [255.0, 255.0, 255.0],
   'is_scale': True}}]
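  • If the model had been trained without eval_dataset, the transforms would have to be passed explicitly. A minimal sketch, assuming the eval_transforms_origin defined above (the image path is a hypothetical placeholder):

image = cv2.imread('data/data10954/cat_12_test/<some_image>.jpg')  # hypothetical example path
result = model.predict(image, transforms=eval_transforms_origin)
print(result[0]['category_id'], result[0]['score'])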

6.2 Model prediction

Sample a few images from the validation set for visualization.

df_val = pd.read_csv('data/data10954/ImageNetDataset/val_list.txt', header=None, sep='\s+')
df_val.columns = ['path', 'cls']
df_val = df_val.sample(n=12, replace=False)
df_val.index = range(len(df_val))
import matplotlib.pyplot as plt
%matplotlib inline

plt.figure(figsize=(8, 10))
for i in range(12):
    plt.subplot(4, 3, i+1)
    plt.axis('off')
    image = cv2.imread(os.path.join('data/data10954/ImageNetDataset', df_val.at[i, 'path']))
    result = model.predict(image)[0]
    plt.title("%d (True) / %d (Predict) - %.4f" % (df_val.at[i, 'cls'], result['category_id'], result['score']))
    plt.imshow(image[:, :, [2, 1, 0]])

plt.tight_layout()
plt.show()

  • Generate the competition submission file work/result.csv, then download it from the file panel in the left column.
  • To avoid problems reading abnormal images in the test set, PIL's Image.open(...).convert('RGB') is used here as well to normalize the image mode before converting to a numpy.ndarray (note that the RGB channels are reordered to BGR).
test_list = sorted(glob.glob('data/data10954/cat_12_test/*.jpg'))
test_df = pd.DataFrame()

for i in range(len(test_list)):
    img = Image.open(test_list[i]).convert('RGB')
    img = np.asarray(img, dtype='float32')
    img = img[:, :, [2, 1, 0]]

    result = model.predict(img)

    test_df.at[i, 'name'] = str(test_list[i]).split('/')[-1]
    test_df.at[i, 'cls'] = int(result[0]['category_id'])

test_df[['name']] = test_df[['name']].astype(str)
test_df[['cls']] = test_df[['cls']].astype(int)

test_df.to_csv('work/result.csv', index=False, header=False)
  • Download the result file and submit the generated result.csv to the competition "Question 1: Twelve-class cat classification" to earn compute quota and points (the submission saved in this version scored 0.954).

7 Project summary

  • The above is the whole process of applying PaddleX to the twelve-class cat image classification competition.
  • Note that code complexity is not necessarily positively correlated with the submission score; the specific steps and parameters can be adjusted as you see fit.
  • Information about model interpretability can be found in the PaddleX 1.3.11 documentation.
  • More image classification tricks can be found in PaddleClas - Tricks.

Keywords: Computer Vision Deep Learning paddlepaddle
