Preface
Using Baidu's open-source PaddleX tool, we can quickly train deep network models for object detection, image classification, instance segmentation and semantic segmentation on our own labeled data. This article records the whole process of using PaddleX to train a simple PPYOLO Tiny (PPYOLOTiny) model that detects cats and dogs.
(1) Data preparation
For the images, we simply search for "cat and dog" on Baidu Images, download 10 pictures at random and save them in the "JPEGImages" folder.
(2) Use the labelme annotation tool to annotate
(1) labelme install & start
# Prerequisite: Anaconda is already installed
# Install
conda activate my_paddlex
conda install pyqt
pip install labelme
# Start
conda activate my_paddlex
labelme
(2) Annotating target boxes
- Open the rectangle annotation tool (right-click menu -> Create Rectangle), as shown in the following figure.
- Drag to draw a box around the target object and enter its label in the pop-up dialog (if the label already exists, just click it; note that labels should not be written in Chinese), as shown in the following figure. If a box is drawn incorrectly, click "Edit Polygons" on the left and then drag the box to adjust it, or click "Delete Polygon" to delete it.
- Click "Save" on the right to save the annotation results to the Annotations directory of the folder created in the data preparation step.
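To check what labelme saved, the annotation files can be inspected with a few lines of Python. The sketch below assumes the usual labelme JSON layout (a top-level "shapes" list whose entries carry "label", "shape_type" and, for a rectangle, two corner "points"). The helper name read_labelme_boxes is made up for this illustration and is not part of labelme or PaddleX.

```python
import json


def read_labelme_boxes(json_path):
    """Return (label, xmin, ymin, xmax, ymax) tuples from one labelme file."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)

    boxes = []
    for shape in data.get("shapes", []):
        if shape.get("shape_type") != "rectangle":
            continue  # this sketch only handles rectangle annotations
        (x1, y1), (x2, y2) = shape["points"]
        # The two corners are stored in drag order, so sort the coordinates.
        boxes.append((shape["label"],
                      min(x1, x2), min(y1, y2),
                      max(x1, x2), max(y1, y2)))
    return boxes
```

Calling the helper on each file in the Annotations directory is a quick way to spot misspelled labels before converting the data.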
(3) More annotation types
This part is copied directly from the PaddleX data annotation documentation.
For details, see the documentation: Image classification data annotation
For details, see the documentation: Object detection data annotation
For details, see the documentation: Instance segmentation data annotation
For details, see the documentation: Semantic segmentation data annotation
(3) Use the tools provided by PaddleX to convert the labelme annotations to VOC format
Data annotated with LabelMe must be converted to PascalVOC or MSCOCO format before it can be used to train an object detection model. Create the D:\dataset_voc directory and, after installing paddlex in your Python environment, run the following command:
paddlex --data_conversion --source labelme --to PascalVOC --pics D:\MyDataset\JPEGImages --annotations D:\MyDataset\Annotations --save_dir D:\dataset_voc
For detailed usage, refer to the official documentation.
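To make the target format concrete, here is a minimal sketch of how rectangle annotations map onto a PascalVOC-style XML file. It is not PaddleX's converter: the function name boxes_to_voc_xml is hypothetical, only the most common VOC fields are written, and RGB images are assumed.

```python
import xml.etree.ElementTree as ET


def boxes_to_voc_xml(filename, width, height, boxes):
    """Build a PascalVOC-style annotation string.

    boxes is a list of (label, xmin, ymin, xmax, ymax) tuples.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"  # assumption: RGB images

    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin),
                           ("xmax", xmax), ("ymax", ymax)):
            # VOC stores integer pixel coordinates
            ET.SubElement(bndbox, tag).text = str(int(round(value)))
    return ET.tostring(root, encoding="unicode")
```

Each object element holds one labeled box, which is the structure the VOC detection reader expects to find in the Annotations directory.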
(4) Dataset splitting
Object detection
The dataset can be randomly split into a 70% training set, a 20% validation set and a 10% test set with the paddlex command:
paddlex --split_dataset --format VOC --dataset_dir D:\MyDataset --val_value 0.2 --test_value 0.1
Running the above command generates labels.txt, train_list.txt, val_list.txt and test_list.txt under D:\MyDataset, which store the category information, the training sample list, the validation sample list and the test sample list respectively.
For detailed usage, refer to the official documentation.
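The idea behind the split can be sketched in plain Python. This is an illustration under assumptions, not the PaddleX implementation: the function names are hypothetical, and the list-file line format (image path and annotation path separated by a space) is assumed from the VOC layout used in this article.

```python
import random


def split_names(names, val_value=0.2, test_value=0.1, seed=0):
    """Shuffle sample names and split them into (train, val, test) lists."""
    names = list(names)
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_value)
    n_test = int(len(names) * test_value)
    val = names[:n_val]
    test = names[n_val:n_val + n_test]
    train = names[n_val + n_test:]  # everything left over is training data
    return train, val, test


def to_list_lines(names):
    """Format one 'image annotation' line per sample (assumed layout)."""
    return ["JPEGImages/{0}.jpg Annotations/{0}.xml".format(n) for n in names]
```

With the 10 images used here, such a split yields 7 training, 2 validation and 1 test sample.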
(5) Data loading
This article covers reading a detection dataset in PascalVOC format; the reference code is part of the complete code below. For reading detection datasets in MSCOCO format and datasets for semantic segmentation tasks, refer to the official documentation.
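Before loading the dataset it can help to confirm that every path in a list file actually exists. The following is a small sanity-check sketch; missing_files is a hypothetical helper, and it assumes each line of the list file holds two space-separated paths relative to the dataset directory.

```python
import os


def missing_files(data_dir, file_list):
    """Return the paths named in file_list that do not exist under data_dir."""
    missing = []
    with open(file_list, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            for rel_path in parts:
                if not os.path.isfile(os.path.join(data_dir, rel_path)):
                    missing.append(rel_path)
    return missing
```

An empty result for train_list.txt and val_list.txt means the dataset reader should at least find all the files it is pointed at.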
(6) Data augmentation
Refer to the official documentation.
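As one concrete example of what a transform does, the normalization step used in this article (mean 0.485, 0.456, 0.406 and std 0.229, 0.224, 0.225) can be written out for a single pixel. This sketch assumes the pixel is first scaled from the 0 to 255 range to 0 to 1, which is my understanding of the default behaviour; normalize_pixel is a hypothetical name used only here.

```python
def normalize_pixel(rgb, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return [((value / 255.0) - m) / s for value, m, s in zip(rgb, mean, std)]
```

A pixel whose channels equal the dataset mean maps to roughly zero, so the network sees inputs centered around zero with similar spread in every channel.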
(7) Model import
Models in paddlex.det
# Detection
YOLOv3 = cv.models.YOLOv3
FasterRCNN = cv.models.FasterRCNN
PPYOLO = cv.models.PPYOLO
PPYOLOTiny = cv.models.PPYOLOTiny
PPYOLOv2 = cv.models.PPYOLOv2
# Instance segmentation
MaskRCNN = cv.models.MaskRCNN
Models in paddlex.seg
UNet = cv.models.UNet
DeepLabV3P = cv.models.DeepLabV3P
FastSCNN = cv.models.FastSCNN
HRNet = cv.models.HRNet
BiSeNetV2 = cv.models.BiSeNetV2
Models in paddlex.cls
ResNet18 = cv.models.ResNet18
ResNet34 = cv.models.ResNet34
ResNet50 = cv.models.ResNet50
ResNet101 = cv.models.ResNet101
ResNet152 = cv.models.ResNet152
ResNet18_vd = cv.models.ResNet18_vd
ResNet34_vd = cv.models.ResNet34_vd
ResNet50_vd = cv.models.ResNet50_vd
ResNet50_vd_ssld = cv.models.ResNet50_vd_ssld
ResNet101_vd = cv.models.ResNet101_vd
ResNet101_vd_ssld = cv.models.ResNet101_vd_ssld
ResNet152_vd = cv.models.ResNet152_vd
ResNet200_vd = cv.models.ResNet200_vd
MobileNetV1 = cv.models.MobileNetV1
MobileNetV2 = cv.models.MobileNetV2
MobileNetV3_small = cv.models.MobileNetV3_small
MobileNetV3_small_ssld = cv.models.MobileNetV3_small_ssld
MobileNetV3_large = cv.models.MobileNetV3_large
MobileNetV3_large_ssld = cv.models.MobileNetV3_large_ssld
AlexNet = cv.models.AlexNet
DarkNet53 = cv.models.DarkNet53
DenseNet121 = cv.models.DenseNet121
DenseNet161 = cv.models.DenseNet161
DenseNet169 = cv.models.DenseNet169
DenseNet201 = cv.models.DenseNet201
DenseNet264 = cv.models.DenseNet264
HRNet_W18_C = cv.models.HRNet_W18_C
HRNet_W30_C = cv.models.HRNet_W30_C
HRNet_W32_C = cv.models.HRNet_W32_C
HRNet_W40_C = cv.models.HRNet_W40_C
HRNet_W44_C = cv.models.HRNet_W44_C
HRNet_W48_C = cv.models.HRNet_W48_C
HRNet_W64_C = cv.models.HRNet_W64_C
Xception41 = cv.models.Xception41
Xception65 = cv.models.Xception65
Xception71 = cv.models.Xception71
ShuffleNetV2 = cv.models.ShuffleNetV2
ShuffleNetV2_swish = cv.models.ShuffleNetV2_swish
The official API documents for the various models are listed below. Example usage appears in the complete code in section (9).
Image classification model API
Target detection model API
Instance segmentation model API
Image segmentation model API
Model loading API
(8) Model training and parameter adjustment
Model training
Training parameter adjustment
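The learning-rate arguments used in the complete code in section (9) (learning_rate, warmup_steps, warmup_start_lr, lr_decay_epochs, lr_decay_gamma) interact with the dataset size, so it is worth checking them by hand. The sketch below is a simplified model of a linear-warmup plus piecewise-decay schedule, written only to reason about those numbers; it is an assumption about the general shape of the schedule, not PaddleX's exact implementation, and learning_rate_at is a hypothetical helper.

```python
def learning_rate_at(step, steps_per_epoch, learning_rate=0.005,
                     warmup_steps=1000, warmup_start_lr=0.0,
                     lr_decay_epochs=(130, 540), lr_decay_gamma=0.5):
    """Approximate learning rate at a global training step (0-based)."""
    if step < warmup_steps:
        # Linear ramp from warmup_start_lr up to learning_rate
        fraction = step / warmup_steps
        return warmup_start_lr + (learning_rate - warmup_start_lr) * fraction
    epoch = step // steps_per_epoch
    # One multiplication by gamma for every decay epoch already reached
    decays = sum(1 for e in lr_decay_epochs if epoch >= e)
    return learning_rate * lr_decay_gamma ** decays
```

If the 70% split of 10 images gives 7 training samples and the batch size is 1, then 100 epochs amount to only 700 steps. Under this simplified model the 1000-step warmup never finishes and the decay epochs 130 and 540 are never reached, which suggests shrinking warmup_steps and lr_decay_epochs for a dataset this small.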
(9) Complete code
import paddlex as pdx
from paddlex import transforms as T

# Data augmentation for training
train_transforms = T.Compose([
    T.MixupImage(mixup_epoch=-1),
    T.RandomDistort(),
    T.RandomExpand(im_padding_value=[123.675, 116.28, 103.53]),
    T.RandomCrop(),
    T.RandomHorizontalFlip(),
    T.BatchRandomResize(
        target_sizes=[192, 224, 256, 288, 320, 352, 384, 416, 448, 480, 512],
        interp='RANDOM'),
    T.Normalize(
        mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Evaluation only resizes and normalizes
eval_transforms = T.Compose([
    T.Resize(
        target_size=320, interp='CUBIC'),
    T.Normalize(
        mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Data loading (VOC format)
train_dataset = pdx.datasets.VOCDetection(
    data_dir='/home/libufan/desktop/catDog/voc',
    file_list='/home/libufan/desktop/catDog/voc/train_list.txt',
    label_list='/home/libufan/desktop/catDog/voc/labels.txt',
    transforms=train_transforms,
    shuffle=True)

eval_dataset = pdx.datasets.VOCDetection(
    data_dir='/home/libufan/desktop/catDog/voc',
    file_list='/home/libufan/desktop/catDog/voc/val_list.txt',
    label_list='/home/libufan/desktop/catDog/voc/labels.txt',
    transforms=eval_transforms)

# Start training
num_classes = len(train_dataset.labels)
model = pdx.det.PPYOLOTiny(num_classes=num_classes)

model.train(
    num_epochs=100,
    train_dataset=train_dataset,
    train_batch_size=1,
    eval_dataset=eval_dataset,
    pretrain_weights='COCO',
    learning_rate=0.005,
    warmup_steps=1000,
    warmup_start_lr=0.0,
    lr_decay_epochs=[130, 540],
    lr_decay_gamma=.5,
    save_interval_epochs=20,
    save_dir='output/ppyolotiny',
    use_vdl=True)

# Use the trained model for prediction
model = pdx.load_model('output/ppyolotiny/best_model')
# Replace with the path to one of your own test images
image_name = 'insect_det/JPEGImages/0217.jpg'
result = model.predict(image_name)
pdx.det.visualize(image_name, result, threshold=0.5, save_dir='./output/ppyolotiny')
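Besides saving a visualization, the prediction result can be filtered directly in Python. The sketch below assumes that the result is a list of dictionaries, each with at least a "category" and a "score" key, which is my understanding of the detection output and should be checked against the official documentation. The helper keep_confident is hypothetical.

```python
def keep_confident(result, threshold=0.5):
    """Keep detections whose score reaches the threshold, best first."""
    kept = [det for det in result if det["score"] >= threshold]
    return sorted(kept, key=lambda det: det["score"], reverse=True)
```

For example, keep_confident(result, threshold=0.5) would keep only the cat and dog boxes the model is reasonably sure about, in the same spirit as the threshold argument passed to the visualization call.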