Detailed explanation of object detection with PaddleX


Using Baidu's open-source PaddleX tool, we can quickly and easily train deep network models for object detection, image classification, instance segmentation and semantic segmentation on our own labeled data. This article records the whole process of using PaddleX to train a simple PPYOLOTiny model that detects cats and dogs.

(1) Data preparation

For the images, we simply search for "cat and dog" on Baidu Images, download 10 pictures at random and save them in the "JPEGImages" folder.
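The later steps expect the images in a JPEGImages folder with an Annotations folder next to it. A minimal sketch of preparing that layout is shown here; the helper name prepare_dataset_dir and the set of accepted file extensions are assumptions made for illustration, not part of PaddleX.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever you downloaded.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}

def prepare_dataset_dir(root):
    """Create JPEGImages/ and Annotations/ under root and list the images found."""
    root = Path(root)
    images_dir = root / "JPEGImages"
    annotations_dir = root / "Annotations"
    images_dir.mkdir(parents=True, exist_ok=True)
    annotations_dir.mkdir(parents=True, exist_ok=True)
    # Only report files that look like images, sorted for stable output.
    return sorted(p.name for p in images_dir.iterdir()
                  if p.suffix.lower() in IMAGE_SUFFIXES)
```

Calling prepare_dataset_dir on the dataset root (for example the MyDataset folder used in the commands later on) returns the image file names, which makes it easy to confirm that all 10 pictures are in place before annotating.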

(2) Use the labelme annotation tool to annotate

(1) Install and start labelme

# Prerequisite: Anaconda is already installed
conda activate my_paddlex
conda install pyqt
pip install labelme

# Start labelme from the same environment
conda activate my_paddlex
labelme

(2) Annotating target boxes

  1. Open the rectangle annotation tool (right-click menu -> Create Rectangle), as shown in the following figure.

  2. Drag to draw a box around the target object and enter its label in the pop-up dialog (if the label already exists, just click it; note that labels should not use Chinese characters), as shown in the following figure. If a box is drawn incorrectly, click "Edit Polygons" on the left and then click the box to adjust it by dragging, or click "Delete Polygon" to delete it.

  3. Click "Save" on the right to save the annotation results to the Annotations directory of the dataset folder (D:\MyDataset\Annotations in the conversion command of step (3)).
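Each saved annotation is a JSON file, and it can help to look inside one before converting. The sketch below reads the rectangles from such a file; the function name read_labelme_rectangles is hypothetical and the sample data is hand-written, but the keys used (shapes, label, points, shape_type) are the ones labelme writes.

```python
import json

def read_labelme_rectangles(json_text):
    """Return (label, xmin, ymin, xmax, ymax) for each rectangle in a labelme file."""
    data = json.loads(json_text)
    boxes = []
    for shape in data.get("shapes", []):
        if shape.get("shape_type") != "rectangle":
            continue  # skip polygons, points, etc.
        (x1, y1), (x2, y2) = shape["points"]
        # The two corners can be stored in either order, so sort them.
        boxes.append((shape["label"],
                      min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)))
    return boxes

# Hand-written sample shaped like a labelme file; the values are made up.
sample = json.dumps({
    "imagePath": "cat_01.jpg", "imageWidth": 640, "imageHeight": 480,
    "shapes": [
        {"label": "cat", "shape_type": "rectangle",
         "points": [[300.0, 200.0], [50.0, 40.0]]},
    ],
})
print(read_labelme_rectangles(sample))  # [('cat', 50.0, 40.0, 300.0, 200.0)]
```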

(3) More annotation types

This part is copied directly from the data annotation section of the PaddleX documentation. For details, see the documentation for each task:

Image classification data annotation
Object detection data annotation
Instance segmentation data annotation
Semantic segmentation data annotation

(3) Use the tool provided by PaddleX to convert the labelme annotations to VOC format

Data annotated with labelme must be converted to PascalVOC or MSCOCO format before it can be used to train an object detection model. Create the D:\dataset_voc directory, then, with PaddleX installed in the Python environment, run the following command:

paddlex --data_conversion --source labelme --to PascalVOC --pics D:\MyDataset\JPEGImages --annotations D:\MyDataset\Annotations --save_dir D:\dataset_voc

For detailed usage, refer to the official documentation.
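To make the target format concrete, here is a rough sketch of what a Pascal VOC annotation for one image looks like when built by hand. This is not the PaddleX converter itself; the function name rectangles_to_voc_xml is hypothetical, and a depth of 3 (colour images) is an assumption.

```python
import xml.etree.ElementTree as ET

def rectangles_to_voc_xml(filename, width, height, boxes):
    """Build a Pascal VOC annotation string from (label, xmin, ymin, xmax, ymax) boxes."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    # depth=3 assumes an RGB image.
    for tag, value in (("width", width), ("height", height), ("depth", 3)):
        ET.SubElement(size, tag).text = str(value)
    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin),
                           ("xmax", xmax), ("ymax", ymax)):
            # VOC stores integer pixel coordinates.
            ET.SubElement(bndbox, tag).text = str(int(round(value)))
    return ET.tostring(root, encoding="unicode")

print(rectangles_to_voc_xml("cat_01.jpg", 640, 480,
                            [("cat", 50.0, 40.0, 300.0, 200.0)]))
```

Comparing this output with one of the XML files written to the save directory is a quick way to confirm that the conversion kept the labels and box coordinates you expect.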

(4) Dataset splitting

For object detection, the dataset can be randomly divided into a 70% training set, a 20% validation set and a 10% test set with the paddlex command:

paddlex --split_dataset --format VOC --dataset_dir D:\MyDataset --val_value 0.2 --test_value 0.1

Running this command generates labels.txt, train_list.txt, val_list.txt and test_list.txt under D:\MyDataset, which store the category information, training sample list, validation sample list and test sample list respectively. Note that --dataset_dir must point to the VOC-format dataset; if the converted data was saved to D:\dataset_voc in the previous step, use that directory here instead.
For detailed usage, refer to the official documentation.
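The idea behind the split can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not the PaddleX implementation: the function names, the fixed seed and the rounding with int() are my own choices, and the list line format (image path, a space, annotation path) is what the VOC list files are expected to contain.

```python
import random

def split_samples(names, val_value=0.2, test_value=0.1, seed=0):
    """Shuffle sample names and split them into train/val/test by ratio."""
    names = sorted(names)
    random.Random(seed).shuffle(names)  # fixed seed for a repeatable split
    num_val = int(len(names) * val_value)
    num_test = int(len(names) * test_value)
    val = names[:num_val]
    test = names[num_val:num_val + num_test]
    train = names[num_val + num_test:]
    return train, val, test

def to_list_lines(names):
    """Format entries as '<image path> <annotation path>' lines."""
    return [f"JPEGImages/{n}.jpg Annotations/{n}.xml" for n in names]

train, val, test = split_samples([f"img_{i:02d}" for i in range(10)])
print(len(train), len(val), len(test))  # 7 2 1
```

With only 10 pictures, the validation and test sets end up with 2 and 1 images, so evaluation numbers from such a small dataset should be read with caution.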

(5) Data loading

This article covers reading a detection dataset in PascalVOC format; the reference code is in the complete code in section (9). For reading detection datasets in MSCOCO format and datasets for semantic segmentation tasks, refer to the official documentation.
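Before handing the list files to the loader, it can be worth checking that every entry points to files that exist. The sketch below does that with the standard library only; check_file_list is a hypothetical helper, and it assumes each line holds an image path and an annotation path separated by whitespace, both relative to the dataset directory.

```python
from pathlib import Path

def check_file_list(data_dir, file_list):
    """Return the entries of a list file whose image or annotation is missing."""
    data_dir = Path(data_dir)
    problems = []
    for line in Path(file_list).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        parts = line.split()
        if len(parts) != 2:
            problems.append(line)  # malformed entry, e.g. a path containing spaces
            continue
        image_rel, annotation_rel = parts
        if not (data_dir / image_rel).is_file() or not (data_dir / annotation_rel).is_file():
            problems.append(line)
    return problems
```

An empty return value means every listed sample was found on disk.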

(6) Data augmentation

Refer to the official documentation.
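To give a feel for what detection augmentation has to do, here is a small plain-Python sketch of two operations that also appear in the complete code: a horizontal flip, which must move the boxes along with the pixels, and normalization with a mean and standard deviation. These helpers are illustrations only, not the PaddleX transforms; in particular the exact pixel convention of the flip and the scaling from [0, 255] to [0, 1] before normalizing are assumptions.

```python
def hflip_box(box, image_width):
    """Mirror an (xmin, ymin, xmax, ymax) box across the vertical centre line."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge and vice versa.
    return (image_width - xmax, ymin, image_width - xmin, ymax)

def normalize_pixel(value, mean, std):
    """Scale a 0-255 channel value to [0, 1], then standardize it."""
    return (value / 255.0 - mean) / std

print(hflip_box((50, 40, 300, 200), image_width=640))  # (340, 40, 590, 200)
print(normalize_pixel(255, mean=0.485, std=0.229))
```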

(7) Model import

Models in paddlex.det

# detection
YOLOv3 = cv.models.YOLOv3
FasterRCNN = cv.models.FasterRCNN
PPYOLO = cv.models.PPYOLO
PPYOLOTiny = cv.models.PPYOLOTiny
PPYOLOv2 = cv.models.PPYOLOv2

# instance segmentation
MaskRCNN = cv.models.MaskRCNN

Models in paddlex.seg

UNet = cv.models.UNet
DeepLabV3P = cv.models.DeepLabV3P
FastSCNN = cv.models.FastSCNN
HRNet = cv.models.HRNet
BiSeNetV2 = cv.models.BiSeNetV2

Models in paddlex.cls

ResNet18 = cv.models.ResNet18
ResNet34 = cv.models.ResNet34
ResNet50 = cv.models.ResNet50
ResNet101 = cv.models.ResNet101
ResNet152 = cv.models.ResNet152

ResNet18_vd = cv.models.ResNet18_vd
ResNet34_vd = cv.models.ResNet34_vd
ResNet50_vd = cv.models.ResNet50_vd
ResNet50_vd_ssld = cv.models.ResNet50_vd_ssld
ResNet101_vd = cv.models.ResNet101_vd
ResNet101_vd_ssld = cv.models.ResNet101_vd_ssld
ResNet152_vd = cv.models.ResNet152_vd
ResNet200_vd = cv.models.ResNet200_vd

MobileNetV1 = cv.models.MobileNetV1
MobileNetV2 = cv.models.MobileNetV2
MobileNetV3_small = cv.models.MobileNetV3_small
MobileNetV3_small_ssld = cv.models.MobileNetV3_small_ssld
MobileNetV3_large = cv.models.MobileNetV3_large
MobileNetV3_large_ssld = cv.models.MobileNetV3_large_ssld

AlexNet = cv.models.AlexNet

DarkNet53 = cv.models.DarkNet53

DenseNet121 = cv.models.DenseNet121
DenseNet161 = cv.models.DenseNet161
DenseNet169 = cv.models.DenseNet169
DenseNet201 = cv.models.DenseNet201
DenseNet264 = cv.models.DenseNet264

HRNet_W18_C = cv.models.HRNet_W18_C
HRNet_W30_C = cv.models.HRNet_W30_C
HRNet_W32_C = cv.models.HRNet_W32_C
HRNet_W40_C = cv.models.HRNet_W40_C
HRNet_W44_C = cv.models.HRNet_W44_C
HRNet_W48_C = cv.models.HRNet_W48_C
HRNet_W64_C = cv.models.HRNet_W64_C

Xception41 = cv.models.Xception41
Xception65 = cv.models.Xception65
Xception71 = cv.models.Xception71

ShuffleNetV2 = cv.models.ShuffleNetV2
ShuffleNetV2_swish = cv.models.ShuffleNetV2_swish

The official documentation for each kind of model is listed below; example code appears in the complete code in section (9).
Image classification model API
Object detection model API
Instance segmentation model API
Semantic segmentation model API
Model loading API

(8) Model training and parameter adjustment

Refer to the official documentation:
Model training
Training parameter adjustment
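One parameter that appears in the complete code is lr_decay_epochs=[130, 540]. A rough sketch of what such a step schedule means is given here; the base learning rate of 0.005 and the decay factor of 0.5 are assumed values for illustration, warmup is ignored, and the exact boundary handling inside PaddleX may differ.

```python
def learning_rate_at(epoch, base_lr, lr_decay_epochs, lr_decay_gamma):
    """Piecewise-constant schedule: multiply by gamma for each decay epoch reached."""
    num_decays = sum(1 for e in lr_decay_epochs if epoch >= e)
    return base_lr * lr_decay_gamma ** num_decays

# Assumed values: base_lr=0.005 and lr_decay_gamma=0.5 are illustrative only.
for epoch in (0, 129, 130, 539, 540):
    print(epoch, learning_rate_at(epoch, 0.005, [130, 540], 0.5))
```

The practical consequence is that the total number of epochs should be larger than the last decay epoch, otherwise the final decay never takes effect.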

(9) Complete code

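import paddlex as pdx
from paddlex import transforms as T

# Data augmentation for training
train_transforms = T.Compose([
    T.MixupImage(mixup_epoch=-1), T.RandomDistort(),
    T.RandomExpand(im_padding_value=[123.675, 116.28, 103.53]), T.RandomCrop(),
    T.RandomHorizontalFlip(), T.BatchRandomResize(
        target_sizes=[192, 224, 256, 288, 320, 352, 384, 416, 448, 480, 512],
        interp='RANDOM'), T.Normalize(
            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Evaluation only resizes and normalizes
eval_transforms = T.Compose([
    T.Resize(
        target_size=320, interp='CUBIC'), T.Normalize(
            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Data import
# Assumption: the arguments of these calls were lost from the original listing.
# dataset_dir is inferred from the prediction path further down and the list
# file names from step (4); point it at your own VOC-format dataset.
dataset_dir = 'insect_det'
train_dataset = pdx.datasets.VOCDetection(
    data_dir=dataset_dir,
    file_list=dataset_dir + '/train_list.txt',
    label_list=dataset_dir + '/labels.txt',
    transforms=train_transforms,
    shuffle=True)
eval_dataset = pdx.datasets.VOCDetection(
    data_dir=dataset_dir,
    file_list=dataset_dir + '/val_list.txt',
    label_list=dataset_dir + '/labels.txt',
    transforms=eval_transforms,
    shuffle=False)

# Start training
num_classes = len(train_dataset.labels)
model = pdx.det.PPYOLOTiny(num_classes=num_classes)
# Assumption: only lr_decay_epochs survived from the original train() call;
# num_epochs and train_batch_size are illustrative values, and save_dir is
# inferred from the load_model path below.
model.train(
    num_epochs=550,
    train_dataset=train_dataset,
    train_batch_size=16,
    eval_dataset=eval_dataset,
    lr_decay_epochs=[130, 540],
    save_dir='output/ppyolotiny')

# Use the trained model
model = pdx.load_model('output/ppyolotiny/best_model')
image_name = 'insect_det/JPEGImages/0217.jpg'
result = model.predict(image_name)
pdx.det.visualize(image_name, result, threshold=0.5, save_dir='./output/ppyolotiny')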
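The threshold=0.5 argument above hides low-confidence boxes in the saved picture. If you want the same filtering on the raw predictions, something like the sketch below can be used. It assumes that the prediction result is a list of dictionaries with category, bbox and score keys, which is how I understand the PaddleX detection output; the sample data is hand-written and the helper name keep_confident is hypothetical.

```python
def keep_confident(results, threshold=0.5):
    """Keep detections whose score is at least threshold, highest score first."""
    kept = [r for r in results if r["score"] >= threshold]
    return sorted(kept, key=lambda r: r["score"], reverse=True)

# Hand-written stand-in for a prediction result; the values are made up.
sample_result = [
    {"category": "dog", "bbox": [310.0, 60.0, 200.0, 180.0], "score": 0.62},
    {"category": "cat", "bbox": [48.0, 40.0, 250.0, 160.0], "score": 0.91},
    {"category": "cat", "bbox": [5.0, 5.0, 20.0, 20.0], "score": 0.08},
]
for det in keep_confident(sample_result):
    print(det["category"], det["score"], det["bbox"])
```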

Keywords: Machine Learning neural networks Deep Learning

Added by 01chris on Wed, 13 Oct 2021 21:42:36 +0300