Notes on Lesson 3 of the PaddlePaddle (飞桨) Pilot Group AI Talent Creation Camp

Thoughts after class

The course is getting harder and harder, and it's a little difficult to keep up. However, thanks to the mentors in the group and the various documents on the PaddlePaddle platform, many of the difficulties were resolved.

Homework completion record

Making the dataset

The dataset I made before was too small and the training results were very poor, so I made a new one. I have recently been watching Miss Kobayashi's Dragon Maid and wanted to build a dataset around it, so I captured 100 pictures to put the dataset together.

For how to build the dataset, please refer to my second note; a rough sketch of the frame-capturing step is shown below.
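
For illustration only, here is a minimal frame-grabbing sketch with OpenCV. The video filename, output folder and sampling interval are placeholders I made up for the example, not values from the camp material:

import os
import cv2  # pip install opencv-python

video_path = 'dragon_maid_ep01.mp4'  # placeholder video file
out_dir = 'work/images'              # folder to collect the captured frames
interval = 150                       # keep roughly one frame out of every 150

os.makedirs(out_dir, exist_ok=True)
cap = cv2.VideoCapture(video_path)
idx = saved = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if idx % interval == 0:
        saved += 1
        cv2.imwrite(os.path.join(out_dir, f'{saved}.jpg'), frame)
    idx += 1
cap.release()
print('saved', saved, 'frames')

After capturing, the frames still need to be annotated (for example with LabelImg) to produce the xml label files used below.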

Training the model

Reference case

At first, I didn't know how to start, so I searched for tutorials on the PaddlePaddle platform and found this one:
Smoking recognition and prediction using a custom dataset

Code interpretation

The code in the reference case above is explained in detail below.

unzip

!unzip -oq data/data94796/pp_smoke.zip -d work/

unzip is a decompression command. The "-oq" and "-d" that follow are parameters of the command; for example, -d means extract to a given location, so the "work/" after it in the code above extracts the archive into the work folder. For the other parameters, please refer to:
Parameter interpretation
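
To confirm the extraction worked, you can simply list the work folder afterwards (my own quick check, not part of the reference case; the folder names follow the paths used in the split script below):

!ls work/
!ls work/Annotations | head -5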

git

! git clone https://gitee.com/paddlepaddle/PaddleDetection.git

git clone is a command that downloads a code repository from the Internet.
If the network speed is slow, you can instead download PaddleDetection to your own computer and upload it to the project when you need it.
Installation code:

!unzip -oq /home/aistudio/PaddleDetection-release-2.1.zip
!pip install -r /home/aistudio/PaddleDetection-release-2.1/requirements.txt

There are many tools in PaddleDetection, which is very convenient.
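
To take a quick look at those tools (my own check; the path assumes the zip above was extracted under /home/aistudio):

!ls /home/aistudio/PaddleDetection-release-2.1/tools

The train.py, eval.py and infer.py scripts in that folder are the ones used later for training, evaluation and prediction.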

Interpretation of the dataset split code

import random
import os
#Generate train.txt and val.txt
#train.txt and val.txt define the dataset split. See ratio = 0.9 below, which means the data is divided into training and validation sets at a 9:1 ratio
random.seed(2020)
xml_dir  = '/home/aistudio/work/Annotations'#Label file directory
#Files with the .xml suffix are label files. Open the Annotations folder and you will see that every file inside ends in .xml
img_dir = '/home/aistudio/work/images'#Image file directory
#This directory stores the pictures, which correspond to the annotation files above
path_list = list()
for img in os.listdir(img_dir):
#os.listdir will return the files in the specified path
    img_path = os.path.join(img_dir,img)
    #os.path.join joins two or more path components into one path
    xml_path = os.path.join(xml_dir,img.replace('jpg', 'xml'))
    path_list.append((img_path, xml_path))#Pair each image with its label file
    
#If this is still unclear, an example may make it easier to understand
#Suppose img_dir = '/home/aistudio/work/images' is a folder full of pictures
#and that these pictures are jpg files named 1 to 20
#Then os.listdir(img_dir) returns 1.jpg, 2.jpg ... 20.jpg
#for img in os.listdir(img_dir) iterates over each of those jpg files
#img_path = os.path.join(img_dir, img) joins them into paths like:
#/home/aistudio/work/images/1.jpg
#/home/aistudio/work/images/2.jpg
#......
#xml_path similarly becomes /home/aistudio/work/Annotations/2.xml
#the directory changes to Annotations and the .jpg suffix is replaced with .xml
#path_list collects each (img_path, xml_path) pair

random.shuffle(path_list)#shuffle means to randomly reorder, i.e. scramble the dataset
ratio = 0.9
train_f = open('/home/aistudio/work/train.txt','w') #Generate the training list file
#open opens the file, and 'w' means open it in write mode
val_f = open('/home/aistudio/work/val.txt' ,'w')#Generate the validation list file

for i ,content in enumerate(path_list):
#enumerate(path_list) enumerates the list
#i and content are the index and the value respectively
#You can think of path_list as an array in C
#i is the position in the array (the i-th element), and content is the value of that i-th element
    img, xml = content
    text = img + ' ' + xml + '\n'
    if i < len(path_list) * ratio:
        train_f.write(text)#write in
    else:
        val_f.write(text)
train_f.close()
val_f.close()

#Generate the label list file
label = ['smoke']#Set the categories you want to detect
with open('/home/aistudio/work/label_list.txt', 'w') as f:
    for text in label:
        f.write(text+'\n')
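
After the script finishes, I like to count the lines in the two files to confirm the 9:1 split (a small check of my own, not part of the reference case):

# count how many samples ended up in each split
with open('/home/aistudio/work/train.txt') as f:
    n_train = len(f.readlines())
with open('/home/aistudio/work/val.txt') as f:
    n_val = len(f.readlines())
print('train:', n_train, 'val:', n_val)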

File selection

The case uses PaddleDetection-release-2.1/configs/ppyolo/ppyolov2_r50vd_dcn_voc.yml; you can also choose a different config.

Open the PaddleDetection-release-2.1/configs/ppyolo/ppyolov2_r50vd_dcn_voc.yml file and you will see:

Note the part circled in red in the figure above. Modifying the first voc.yml file it references is enough to get things working, but the other parameters can also be adjusted for better results.


The paths above are the file paths of train.txt, val.txt and label_list.txt.

After modifying these parameters, you can start training; there should be basically no problems. The training command is shown below.
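
For reference, the training command looks roughly like this; the exact paths depend on where PaddleDetection was extracted and which config you chose, so treat it as a sketch rather than something to copy blindly:

%cd /home/aistudio/PaddleDetection-release-2.1
!python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_voc.yml --eval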

Problem solving

  1. Error: ValueError: The device should not be "gpu", since PaddlePaddle is not compiled with CUDA. The fix is to switch the project environment to the Advanced or Premium (GPU) edition; see the quick check after this list.
  2. If things keep running endlessly on the decompressed dataset, some of the pictures may be broken. Delete them.
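
For item 1, a quick way to check whether the current PaddlePaddle build supports the GPU (my own addition, not from the case):

import paddle
# True means PaddlePaddle was compiled with CUDA and the "gpu" device can be used
print(paddle.is_compiled_with_cuda())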
