The test environment:
- Python 2.7
- Keras 2.2.4
- TensorFlow 1.10.0
- keras-yolo3, downloaded from https://github.com/qqwweee/keras-yolo3
Following its README, let's study train.py. Setting aside the hyperparameters and training code for now, first locate the sample-related processing:
Line 17: the sample description file, `annotation_path = 'train.txt'`. Note the annotation format this file must follow.
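Per the repo's README, each row describes one image: `image_file_path box1 box2 ... boxN`, where each box is `x_min,y_min,x_max,y_max,class_id` (no spaces inside a box). For example:

```
path/to/img1.jpg 50,100,150,200,0 30,50,200,120,3
path/to/img2.jpg 120,300,250,600,2
```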
Lines 41-48: splitting the training and validation sets:

```python
val_split = 0.1
with open(annotation_path) as f:
    lines = f.readlines()
np.random.seed(10101)
np.random.shuffle(lines)
np.random.seed(None)
num_val = int(len(lines)*val_split)
num_train = len(lines) - num_val
```

The fixed seed (10101) makes the shuffle, and therefore the train/validation split, reproducible across runs; `np.random.seed(None)` then restores nondeterministic seeding.
Line 59, training: `model.fit_generator(data_generator_wrapper(...))`
Line 184: `data_generator_wrapper`, which validates its inputs and delegates to `data_generator` (line 165).
Line 175: `image, box = get_random_data(annotation_lines[i], input_shape, random=True)`
Here `annotation_lines` is a slice of `lines`: `lines[:num_train]` for training, `lines[num_train:]` for validation.
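For context, the generator those lines point at looks roughly like this (reconstructed from the repo from memory, so minor details may differ): it reshuffles the annotation lines each epoch, builds a batch of augmented samples, and converts the raw boxes into the per-scale `y_true` tensors via `preprocess_true_boxes`:

```python
def data_generator(annotation_lines, batch_size, input_shape, anchors, num_classes):
    '''data generator for fit_generator'''
    n = len(annotation_lines)
    i = 0
    while True:
        image_data, box_data = [], []
        for b in range(batch_size):
            if i == 0:
                np.random.shuffle(annotation_lines)
            # one augmented sample per annotation line (see get_random_data below)
            image, box = get_random_data(annotation_lines[i], input_shape, random=True)
            image_data.append(image)
            box_data.append(box)
            i = (i + 1) % n
        image_data = np.array(image_data)
        box_data = np.array(box_data)
        # convert raw boxes into the per-scale ground-truth tensors YOLOv3 trains against
        y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
        # model inputs are the image batch plus y_true; targets are dummy zeros
        # because the loss is computed inside the model as a custom layer
        yield [image_data] + y_true, np.zeros(batch_size)
```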
Now look at `get_random_data` in `yolo3/utils.py`, then refer back to the relevant code in train.py.
```python
def get_random_data(annotation_line, input_shape, random=True, max_boxes=20,
        jitter=.3, hue=.1, sat=1.5, val=1.5, proc_img=True):
    '''random preprocessing for real-time data augmentation'''
    # split on whitespace; the raw line still contains the trailing '\n'
    line = annotation_line.split()
    # the first field is the image path; open it
    image = Image.open(line[0])
    iw, ih = image.size
    h, w = input_shape
    # from the second field on, split each box on ',' and cast to int,
    # stacking into an array. Example annotation:
    # path/to/img2.jpg 120,300,250,600,2 -- box format: x_min,y_min,x_max,y_max,class_id (no space)
    box = np.array([np.array(list(map(int, box.split(',')))) for box in line[1:]])

    if not random:
        # resize image
        scale = min(w/iw, h/ih)
        nw = int(iw*scale)
        nh = int(ih*scale)
        dx = (w-nw)//2
        dy = (h-nh)//2
        image_data = 0
        if proc_img:
            image = image.resize((nw,nh), Image.BICUBIC)
            new_image = Image.new('RGB', (w,h), (128,128,128))
            new_image.paste(image, (dx, dy))
            image_data = np.array(new_image)/255.

        # correct boxes
        box_data = np.zeros((max_boxes,5))
        if len(box)>0:
            np.random.shuffle(box)
            if len(box)>max_boxes: box = box[:max_boxes]
            box[:, [0,2]] = box[:, [0,2]]*scale + dx
            box[:, [1,3]] = box[:, [1,3]]*scale + dy
            box_data[:len(box)] = box

        return image_data, box_data

    # resize image
    new_ar = w/h * rand(1-jitter,1+jitter)/rand(1-jitter,1+jitter)
    scale = rand(.25, 2)
    if new_ar < 1:
        nh = int(scale*h)
        nw = int(nh*new_ar)
    else:
        nw = int(scale*w)
        nh = int(nw/new_ar)
    image = image.resize((nw,nh), Image.BICUBIC)

    # place image
    dx = int(rand(0, w-nw))
    dy = int(rand(0, h-nh))
    new_image = Image.new('RGB', (w,h), (128,128,128))
    new_image.paste(image, (dx, dy))
    image = new_image

    # flip image or not
    flip = rand()<.5
    if flip: image = image.transpose(Image.FLIP_LEFT_RIGHT)

    # distort image
    hue = rand(-hue, hue)
    sat = rand(1, sat) if rand()<.5 else 1/rand(1, sat)
    val = rand(1, val) if rand()<.5 else 1/rand(1, val)
    x = rgb_to_hsv(np.array(image)/255.)
    x[..., 0] += hue
    x[..., 0][x[..., 0]>1] -= 1
    x[..., 0][x[..., 0]<0] += 1
    x[..., 1] *= sat
    x[..., 2] *= val
    x[x>1] = 1
    x[x<0] = 0
    image_data = hsv_to_rgb(x)  # numpy array, 0 to 1

    # correct boxes
    box_data = np.zeros((max_boxes,5))
    if len(box)>0:
        np.random.shuffle(box)
        box[:, [0,2]] = box[:, [0,2]]*nw/iw + dx
        box[:, [1,3]] = box[:, [1,3]]*nh/ih + dy
        if flip: box[:, [0,2]] = w - box[:, [2,0]]
        box[:, 0:2][box[:, 0:2]<0] = 0
        box[:, 2][box[:, 2]>w] = w
        box[:, 3][box[:, 3]>h] = h
        box_w = box[:, 2] - box[:, 0]
        box_h = box[:, 3] - box[:, 1]
        box = box[np.logical_and(box_w>1, box_h>1)]  # discard invalid box
        if len(box)>max_boxes: box = box[:max_boxes]
        box_data[:len(box)] = box

    return image_data, box_data
```
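A quick sanity check of one sample (the image path below is a placeholder; shapes assume the default 416x416 input and `max_boxes=20`):

```python
# Hypothetical single-line check; substitute a real image path.
line = '/path/to/ccpd_base/some_plate.jpg 154,383,386,473,0'
image_data, box_data = get_random_data(line, (416, 416), random=True)
print(image_data.shape)  # (416, 416, 3), float values in [0, 1]
print(box_data.shape)    # (20, 5): up to max_boxes rows of x_min,y_min,x_max,y_max,class_id
```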
- So after reading this, you know exactly how train.txt must be formatted (although the README already states it clearly).
- The dataset is the ccpd_base subset of CCPD.
Code to generate train.txt:

```python
# -*- coding: utf-8 -*-
import os
import random

import cv2

traintxt = 'train.txt'
imgs_path = '/home/jiteng/private_app/dataset/ccpd/ccpd_dataset/ccpd_base/'  # 199998 images

img_names = os.listdir(imgs_path)
# take a random slice of 10000 images out of the ~200k
start_get = random.randint(0, 180) * 1000
img_names = img_names[start_get:start_get+10000]
print(len(img_names))

f = open(traintxt, 'a+')
for imgname in img_names:
    # imgname example:
    # 025-95_113-154&383_386&473-386&473_177&454_154&383_363&402-0_0_22_27_27_33_16-37-15.jpg
    name = imgs_path + imgname
    image = cv2.imread(name)  # read the image (result unused; only the filename is parsed)
    # the third '-'-separated field encodes the bounding box: 154&383_386&473
    label = imgname.split('-')[2]
    label = label.replace('&', ',')
    label = label.replace('_', ',')
    # append class_id 0 (single class: license plate)
    save_str = name + ' ' + label + ',0\n'
    print(save_str)
    f.write(save_str)
f.close()
```
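To make the parsing concrete, here is the filename from the comment above worked through by hand:

```python
imgname = '025-95_113-154&383_386&473-386&473_177&454_154&383_363&402-0_0_22_27_27_33_16-37-15.jpg'
label = imgname.split('-')[2].replace('&', ',').replace('_', ',')
print(label)  # 154,383,386,473 -> x_min,y_min,x_max,y_max; ',0' is then appended as class_id
```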
- Next, train following the README's steps, using the original pretrained weights (training isn't finished yet; results will be recorded later). Quoting the README:
Make sure you have run
`python convert.py -w yolov3.cfg yolov3.weights model_data/yolo_weights.h5`
The file model_data/yolo_weights.h5 is used to load pretrained weights.
Modify train.py and start training: `python train.py`
Use your trained weights or checkpoint weights with command line option `--model model_file` when using yolo_video.py. Remember to modify class path or anchor path, with `--classes class_file` and `--anchors anchor_file`.
If you want to use original pretrained weights for YOLOv3:
1. `wget https://pjreddie.com/media/files/darknet53.conv.74`
2. Rename it as darknet53.weights
3. `python convert.py -w darknet53.cfg darknet53.weights model_data/darknet53_weights.h5`
4. Use model_data/darknet53_weights.h5 in train.py
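For this single-class plate dataset, the edits to train.py itself are minimal. A sketch of the paths near the top of `_main` (the plate classes filename is my own invention; the anchors file ships with the repo):

```python
# In train.py, assuming a one-class license-plate setup:
annotation_path = 'train.txt'                    # generated by the script above
classes_path = 'model_data/plate_classes.txt'    # hypothetical file containing a single line: plate
anchors_path = 'model_data/yolo_anchors.txt'     # default anchors shipped with the repo
```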