8-15 (YOLOv3 training sample)

The test environment is:
- Python 2.7
- Keras 2.2.4
- tensorflow 1.10.0

  1. Download keras-yolo3: https://github.com/qqwweee/keras-yolo3
  2. Following its readme, let's study train.py.
    Ignoring the model parameters and the training code for now, first find the sample-related processing:
    Line 17: the annotation file, annotation_path = 'train.txt' (see the readme for the annotation format)
    Lines 41-48: the training/validation split:
    val_split = 0.1
    with open(annotation_path) as f:
        lines = f.readlines()
    np.random.seed(10101)
    np.random.shuffle(lines)
    np.random.seed(None)
    num_val = int(len(lines)*val_split)
    num_train = len(lines) - num_val
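The split above can be reproduced in isolation. This is a minimal sketch, assuming a hypothetical list of annotation lines instead of reading train.txt; the fixed seed makes the shuffle (and therefore the split) identical on every run, which matters if training is ever restarted:

```python
import numpy as np

# Hypothetical annotation lines standing in for the contents of train.txt.
lines = ['img%d.jpg 10,20,30,40,0' % i for i in range(100)]

val_split = 0.1
np.random.seed(10101)        # fixed seed -> same shuffle on every run
np.random.shuffle(lines)
np.random.seed(None)         # restore nondeterminism for later random calls
num_val = int(len(lines) * val_split)
num_train = len(lines) - num_val

print(num_train, num_val)    # 90 10
```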

Line 59, training: model.fit_generator(data_generator_wrapper(...))
Line 184: data_generator_wrapper, which just validates its arguments and calls data_generator at line 165.

Line 175: image, box = get_random_data(annotation_lines[i], input_shape, random=True)
Here annotation_lines is a slice of lines (lines[:num_train] for training, lines[num_train:] for validation).
Now look at get_random_data in yolo3/utils.py, then come back to the relevant code in train.py.
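Before diving into get_random_data, the shape of the generator that calls it can be sketched as follows. This is a simplified sketch, not the repo's exact code: toy_get_random_data is a stand-in I made up, and the real code converts box_data to y_true via preprocess_true_boxes, a step omitted here:

```python
import numpy as np

def toy_get_random_data(line, input_shape):
    """Stand-in for get_random_data: returns a dummy image and box array."""
    h, w = input_shape
    return np.zeros((h, w, 3)), np.zeros((20, 5))

def data_generator(annotation_lines, batch_size, input_shape):
    """Simplified sketch of the repo's data_generator: reshuffle the lines
    each time a full pass completes, and yield batches forever."""
    n = len(annotation_lines)
    i = 0
    while True:
        image_data, box_data = [], []
        for _ in range(batch_size):
            if i == 0:
                np.random.shuffle(annotation_lines)
            image, box = toy_get_random_data(annotation_lines[i], input_shape)
            image_data.append(image)
            box_data.append(box)
            i = (i + 1) % n
        # real code: y_true = preprocess_true_boxes(box_data, ...) -- omitted
        yield np.array(image_data), np.array(box_data)

gen = data_generator(['a.jpg 1,2,3,4,0'] * 8, 4, (416, 416))
images, boxes = next(gen)
print(images.shape, boxes.shape)   # (4, 416, 416, 3) (4, 20, 5)
```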

def get_random_data(annotation_line, input_shape, random=True, max_boxes=20, jitter=.3, hue=.1, sat=1.5, val=1.5, proc_img=True):
    '''random preprocessing for real-time data augmentation'''

	# Split on whitespace; this also strips the trailing '\n'
    line = annotation_line.split()

	# The first field is the image path
    image = Image.open(line[0])
    iw, ih = image.size
    h, w = input_shape
    
	# Split each remaining field on ',' and convert the str values to int, then stack into an array.
    #     path/to/img2.jpg 120,300,250,600,2    Box format: `x_min,y_min,x_max,y_max,class_id` (no spaces).
    box = np.array([np.array(list(map(int,box.split(',')))) for box in line[1:]]) 

    if not random:
        # resize image
        scale = min(w/iw, h/ih)
        nw = int(iw*scale)
        nh = int(ih*scale)
        dx = (w-nw)//2
        dy = (h-nh)//2
        image_data=0
        if proc_img:
            image = image.resize((nw,nh), Image.BICUBIC)
            new_image = Image.new('RGB', (w,h), (128,128,128))
            new_image.paste(image, (dx, dy))
            image_data = np.array(new_image)/255.

        # correct boxes
        box_data = np.zeros((max_boxes,5))
        if len(box)>0:
            np.random.shuffle(box)
            if len(box)>max_boxes: box = box[:max_boxes]
            box[:, [0,2]] = box[:, [0,2]]*scale + dx
            box[:, [1,3]] = box[:, [1,3]]*scale + dy
            box_data[:len(box)] = box

        return image_data, box_data

    # resize image: random aspect-ratio jitter and random scale
    # (rand(a, b) is a helper in yolo3/utils: np.random.rand()*(b-a) + a)
    new_ar = w/h * rand(1-jitter,1+jitter)/rand(1-jitter,1+jitter)
    scale = rand(.25, 2)
    if new_ar < 1:
        nh = int(scale*h)
        nw = int(nh*new_ar)
    else:
        nw = int(scale*w)
        nh = int(nw/new_ar)
    image = image.resize((nw,nh), Image.BICUBIC)

    # place image
    dx = int(rand(0, w-nw))
    dy = int(rand(0, h-nh))
    new_image = Image.new('RGB', (w,h), (128,128,128))
    new_image.paste(image, (dx, dy))
    image = new_image

    # flip image or not
    flip = rand()<.5
    if flip: image = image.transpose(Image.FLIP_LEFT_RIGHT)

    # distort image
    hue = rand(-hue, hue)
    sat = rand(1, sat) if rand()<.5 else 1/rand(1, sat)
    val = rand(1, val) if rand()<.5 else 1/rand(1, val)
    x = rgb_to_hsv(np.array(image)/255.)
    x[..., 0] += hue
    x[..., 0][x[..., 0]>1] -= 1
    x[..., 0][x[..., 0]<0] += 1
    x[..., 1] *= sat
    x[..., 2] *= val
    x[x>1] = 1
    x[x<0] = 0
    image_data = hsv_to_rgb(x) # numpy array, 0 to 1

    # correct boxes
    box_data = np.zeros((max_boxes,5))
    if len(box)>0:
        np.random.shuffle(box)
        box[:, [0,2]] = box[:, [0,2]]*nw/iw + dx
        box[:, [1,3]] = box[:, [1,3]]*nh/ih + dy
        if flip: box[:, [0,2]] = w - box[:, [2,0]]
        box[:, 0:2][box[:, 0:2]<0] = 0
        box[:, 2][box[:, 2]>w] = w
        box[:, 3][box[:, 3]>h] = h
        box_w = box[:, 2] - box[:, 0]
        box_h = box[:, 3] - box[:, 1]
        box = box[np.logical_and(box_w>1, box_h>1)] # discard invalid box
        if len(box)>max_boxes: box = box[:max_boxes]
        box_data[:len(box)] = box

    return image_data, box_data
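The non-random branch is a plain letterbox resize, and its geometry is easy to check by hand. A worked example, using a hypothetical 640x480 image resized into a 416x416 input (no PIL needed, only the arithmetic is reproduced):

```python
import numpy as np

iw, ih = 640, 480          # original image size (hypothetical)
h, w = 416, 416            # network input size
scale = min(w / float(iw), h / float(ih))   # 0.65: limited by the wider side
nw, nh = int(iw * scale), int(ih * scale)   # 416, 312
dx, dy = (w - nw) // 2, (h - nh) // 2       # 0, 52: gray bars top and bottom

# A box in original-image coordinates is mapped with the same scale/offset:
box = np.array([[120, 300, 240, 460, 2]], dtype=float)
box[:, [0, 2]] = box[:, [0, 2]] * scale + dx
box[:, [1, 3]] = box[:, [1, 3]] * scale + dy
print(nw, nh, dx, dy)      # 416 312 0 52
print(box[0, :4])          # [ 78. 247. 156. 351.]
```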
  1. So after reading this, you know how the annotation lines need to be written (although the readme already states the format clearly - -)
  2. The dataset is ccpd_base.
    Code to generate train.txt:
# -*- coding: utf-8 -*-
import cv2
import os
import random

traintxt = 'train.txt'
imgs_path = '/home/jiteng/private_app/dataset/ccpd/ccpd_dataset/ccpd_base/' #199998 items

img_names = os.listdir(imgs_path)
start_get = random.randint(0, 180)
start_get = start_get * 1000   # bug fix: the original discarded this result
img_names = img_names[start_get:start_get + 10000]   # take 10000 consecutive names from a random offset
print len(img_names)
f = open(traintxt,'a+')

for imgname in img_names:
    #imgname   --   025-95_113-154&383_386&473-386&473_177&454_154&383_363&402-0_0_22_27_27_33_16-37-15.jpg
    name = imgs_path + imgname
    image = cv2.imread(name)   # sanity check that the image can be read (result otherwise unused)

    label = imgname.split('-')[2]      #154&383_386&473
    label = label.replace('&',',')
    label = label.replace('_',',')

    save_str = name + ' ' + label + ',0\n'
    print save_str

    f.write(save_str)

f.close()
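The label extraction in the loop above is pure string work, so it can be checked without cv2 or the dataset on disk. A minimal sketch, assuming the standard CCPD filename layout in which the third '-'-separated field holds the bounding box as x1&y1_x2&y2 (the '/path/to/' prefix is a placeholder):

```python
# Example CCPD filename (from the comment in the loop above).
imgname = '025-95_113-154&383_386&473-386&473_177&454_154&383_363&402-0_0_22_27_27_33_16-37-15.jpg'

label = imgname.split('-')[2]                      # '154&383_386&473'
label = label.replace('&', ',').replace('_', ',')  # '154,383,386,473'

# One annotation line: <path> <x_min,y_min,x_max,y_max,class_id>
save_str = '/path/to/' + imgname + ' ' + label + ',0'
print(save_str)
```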

  1. Next, train following the installation steps, using the original pretrained weights (not yet trained at this point; results will be recorded later).

Make sure you have run python convert.py -w yolov3.cfg yolov3.weights model_data/yolo_weights.h5
The file model_data/yolo_weights.h5 is used to load pretrained weights.
Modify train.py and start training.
python train.py
Use your trained weights or checkpoint weights with command line option --model model_file when using yolo_video.py
Remember to modify class path or anchor path, with --classes class_file and --anchors anchor_file.
If you want to use original pretrained weights for YOLOv3:
1. wget https://pjreddie.com/media/files/darknet53.conv.74
2. rename it as darknet53.weights
3. python convert.py -w darknet53.cfg darknet53.weights model_data/darknet53_weights.h5
4. use model_data/darknet53_weights.h5 in train.py

  1. Done.


Added by ded on Mon, 07 Oct 2019 08:32:42 +0300