A detailed walkthrough of the Detectron2 "quick start" detection tutorial notebook

Detectron2 quick start

This walkthrough follows the official Colab Notebook; see also the Getting Started section of the Detectron2 documentation.

1. Use a pre-trained Detectron2 model

After downloading an image, we create a detectron2 config and then build a DefaultPredictor from that config to run inference on a single image.

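from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()  # get the default config
# Update the config from the mask_rcnn_R_50_FPN_3x.yaml configuration file
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
# Set the score threshold for inference to 0.5
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
# Select the trained weights that match the configuration file
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
# Create the DefaultPredictor
predictor = DefaultPredictor(cfg)
# Inference (im is the BGR image loaded with cv2.imread)
outputs = predictor(im)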

Inspect the network output:

print(outputs["instances"].pred_classes)
print(outputs["instances"].pred_boxes)

We can use the Visualizer class to visualize the output results:

from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import Visualizer
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
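The im[:, :, ::-1] slice converts between channel orders: OpenCV loads images as BGR, while Visualizer expects RGB. The following is a minimal sketch of what that slice does, using a tiny made-up array instead of a real image.

```python
import numpy as np

# A 1x2 "image" in OpenCV's BGR channel order: one pure-blue and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the first and third channels, giving RGB order.
rgb = bgr[:, :, ::-1]

print(rgb[0, 0].tolist())  # the blue pixel, now in RGB order: [0, 0, 255]
print(rgb[0, 1].tolist())  # the red pixel, now in RGB order: [255, 0, 0]
```

Applying the same slice a second time restores the original order, which is why the result of out.get_image() is reversed again before it is passed to an OpenCV display function.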

2. Train on a custom dataset

Prepare dataset

In this section, we learn how to train on a custom dataset, using the balloon segmentation dataset as an example. This dataset has only one class: balloon. We start from a model pre-trained on the COCO dataset.

After preparing the dataset, the simplest route is to convert it into COCO format. A dataset in COCO format can be registered directly with the following function:

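from detectron2.data.datasets import register_coco_instances
register_coco_instances("my_dataset", {}, "json_annotation.json", "path/to/image/dir")  # placeholder name and paths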
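For orientation, here is a rough sketch of what a minimal COCO-format annotation file contains. The file name, image size, and coordinates are made up for illustration; only the overall layout (images, categories, annotations) is the point.

```python
import json

# Hypothetical single-image dataset; file name, sizes and coordinates are made up.
coco = {
    "images": [
        {"id": 1, "file_name": "balloon_0001.jpg", "height": 480, "width": 640},
    ],
    "categories": [
        {"id": 1, "name": "balloon"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            # COCO boxes are [x_min, y_min, width, height] in absolute pixels
            "bbox": [100.0, 120.0, 50.0, 80.0],
            # Each polygon is a flat list [x1, y1, x2, y2, ...]
            "segmentation": [[100.0, 120.0, 150.0, 120.0, 150.0, 200.0, 100.0, 200.0]],
            "area": 4000.0,
            "iscrowd": 0,
        },
    ],
}

# Serialize to the JSON text that would be written to the annotation file.
coco_json = json.dumps(coco, indent=2)
```

Note that COCO stores boxes as x, y, width, height, whereas the custom loader in this tutorial uses corner coordinates (XYXY_ABS).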

If our dataset is in a custom format, we need to write a function that loads it into detectron2's standard format. First, we define get_balloon_dicts(img_dir), which reads the dataset folder and returns dataset_dicts. Then we register that function in detectron2 with DatasetCatalog.register. When training, use MetadataCatalog.get to retrieve the metadata of the registered dataset.

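import json
import os

import cv2
import numpy as np
from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.structures import BoxMode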

def get_balloon_dicts(img_dir):
    json_file = os.path.join(img_dir, "via_region_data.json")
    with open(json_file) as f:
        imgs_anns = json.load(f)
    dataset_dicts = []
    for idx, v in enumerate(imgs_anns.values()):
        record = {}
        filename = os.path.join(img_dir, v["filename"])
        height, width = cv2.imread(filename).shape[:2]
        # For each picture, we need to record:
        # 1. file name, 2. image id, 3. height, 4. width
        record["file_name"] = filename
        record["image_id"] = idx
        record["height"] = height
        record["width"] = width
        annos = v["regions"]
        objs = []
        # For each annotation of a single picture
        for _, anno in annos.items():
            assert not anno["region_attributes"]
            anno = anno["shape_attributes"]
            px = anno["all_points_x"]
            py = anno["all_points_y"]
            poly = [(x+0.5, y+0.5) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]
            # Build the annotation dictionary for this object
            obj = {
                # The bounding box is derived from the polygon outline
                "bbox": [np.min(px), np.min(py), np.max(px), np.max(py)],
                "bbox_mode": BoxMode.XYXY_ABS,
                "segmentation": [poly],
                "category_id": 0,
            }
            objs.append(obj)
        # Store the converted annotations in the record
        record["annotations"] = objs
        # Add this image's record to dataset_dicts
        dataset_dicts.append(record)
    return dataset_dicts

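for d in ["train", "val"]:
    DatasetCatalog.register("balloon_" + d, lambda d=d: get_balloon_dicts("balloon/" + d))
    # Tell the visualizer the name of the single class
    MetadataCatalog.get("balloon_" + d).set(thing_classes=["balloon"])

ballon_metadata = MetadataCatalog.get("balloon_train")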
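The lambda d=d default argument in the registration loop is easy to overlook. The sketch below uses a hypothetical dictionary-based registry (not Detectron2's DatasetCatalog) and a placeholder loader to show why it matters: the registry stores a function and calls it later, so each function must remember its own split name.

```python
# Hypothetical stand-in for a dataset registry: it stores loader functions by name
# and only calls them when the dataset is requested.
registry = {}

def register(name, loader):
    registry[name] = loader

def load_split(split):
    # Placeholder loader; a real one would read annotations from disk.
    return "loaded balloon/" + split

# Without a default argument, every lambda looks up `d` when it is called,
# by which time the loop has finished and `d` is "val".
late = {}
for d in ["train", "val"]:
    late["balloon_" + d] = lambda: load_split(d)

# With d=d, each lambda captures the value of `d` at definition time.
for d in ["train", "val"]:
    register("balloon_" + d, lambda d=d: load_split(d))

print(late["balloon_train"]())      # loaded balloon/val
print(registry["balloon_train"]())  # loaded balloon/train
```

Because registration is lazy, the annotation files are only read when the dataset is actually requested, for example when a trainer builds its data loader.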

With that done, we can check whether the dataset is registered correctly:

import random

dataset_dicts = get_balloon_dicts("balloon/train")
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=ballon_metadata, scale=0.5)
    out = visualizer.draw_dataset_dict(d)
    # cv2_imshow is a Colab helper (google.colab.patches), not part of OpenCV;
    # outside Colab, use cv2.imshow to display the picture.
    cv2_imshow(out.get_image()[:, :, ::-1])
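Besides looking at the drawn annotations, a quick programmatic check can catch structural mistakes such as a misspelled key. The helper below is a hypothetical sketch, not part of Detectron2; it assumes XYXY_ABS boxes and polygon segmentations as used in this tutorial, and the example record is made up.

```python
def check_record(record):
    """Return a list of problems found in one dataset dict (empty if none)."""
    problems = []
    for key in ("file_name", "image_id", "height", "width", "annotations"):
        if key not in record:
            problems.append("missing key: " + key)
    if problems:
        return problems
    for i, obj in enumerate(record["annotations"]):
        x0, y0, x1, y1 = obj["bbox"]  # assumes XYXY_ABS boxes, as in this tutorial
        if not (0 <= x0 < x1 <= record["width"] and 0 <= y0 < y1 <= record["height"]):
            problems.append("annotation %d: bbox outside image or empty" % i)
        for poly in obj["segmentation"]:
            # A polygon is a flat [x1, y1, x2, y2, ...] list with at least 3 points
            if len(poly) % 2 != 0 or len(poly) < 6:
                problems.append("annotation %d: malformed polygon" % i)
    return problems

good = {
    "file_name": "balloon/train/example.jpg",  # hypothetical path
    "image_id": 0, "height": 100, "width": 200,
    "annotations": [{"bbox": [10, 20, 60, 80],
                     "segmentation": [[10.5, 20.5, 60.5, 20.5, 60.5, 80.5]]}],
}
bad = dict(good)
bad["filename"] = bad.pop("file_name")  # misspelled key

print(check_record(good))  # []
print(check_record(bad))   # ['missing key: file_name']
```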


With the dataset prepared, we can start training. As an example, we fine-tune the R50-FPN Mask R-CNN model pre-trained on the COCO dataset.

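import os

from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
# "balloon_train" has been registered above; select it as the training dataset
cfg.DATASETS.TRAIN = ("balloon_train",)
cfg.DATASETS.TEST = ()
# Load the pre-trained model weights
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
# Some basic training parameters
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 300  # short schedule, as in the official tutorial; adjust for your dataset
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # the dataset has a single class (balloon)

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)  # load cfg.MODEL.WEIGHTS before training
# Call trainer.train() to start training
trainer.train()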
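Detectron2's solver counts training length in iterations (cfg.SOLVER.MAX_ITER), with cfg.SOLVER.IMS_PER_BATCH images per iteration, rather than in epochs. As a rough aid for choosing these values, the sketch below converts between the two; the helper name and the numbers are hypothetical.

```python
def epochs_seen(max_iter, ims_per_batch, num_images):
    """Approximate number of passes over the training set."""
    return max_iter * ims_per_batch / num_images

# Hypothetical numbers, for illustration only.
print(epochs_seen(max_iter=300, ims_per_batch=2, num_images=60))  # 10.0
```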

The training curves can be visualized in TensorBoard:

%load_ext tensorboard
%tensorboard --logdir output

Inference and Evaluation

Now that the model is trained, we want to evaluate it and run inference on single images. For that we create a predictor. The config used for inference must be consistent with the one used for training, so we only make a few small changes to it:

cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")  # load the weights of the trained model
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # keep only detections with a score above 0.7
predictor = DefaultPredictor(cfg)
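To illustrate the effect of the score threshold, here is a minimal sketch with made-up detections represented as (label, score) pairs. Real predictor outputs are Instances objects, and the helper name is hypothetical; the point is only that raising the threshold trades recall for fewer false positives.

```python
# Hypothetical detections as (label, score) pairs; real outputs are Instances objects.
detections = [("balloon", 0.98), ("balloon", 0.72), ("balloon", 0.55), ("balloon", 0.31)]

def keep_confident(dets, threshold):
    """Keep detections whose score is strictly above the threshold."""
    return [d for d in dets if d[1] > threshold]

print(len(keep_confident(detections, 0.5)))  # 3
print(len(keep_confident(detections, 0.7)))  # 2
```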

Then we randomly select several pictures from the validation set and visualize the predictions:

from detectron2.utils.visualizer import ColorMode

dataset_dicts = get_balloon_dicts("balloon/val")
for d in random.sample(dataset_dicts, 3):
    im = cv2.imread(d["file_name"])
    outputs = predictor(im)
    # For the format of outputs, see the model output format section of the Detectron2 documentation
    v = Visualizer(im[:, :, ::-1],
                   metadata=ballon_metadata,
                   # Gray out unsegmented pixels (only available for segmentation models)
                   instance_mode=ColorMode.IMAGE_BW)
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])

We can use COCOEvaluator to compute AP and other metrics on the validation dataset:

from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader

evaluator = COCOEvaluator("balloon_val", ("bbox", "segm"), False, output_dir="./output/")
val_loader = build_detection_test_loader(cfg, "balloon_val")
print(inference_on_dataset(trainer.model, val_loader, evaluator))
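The AP numbers reported by COCO-style evaluation are built on intersection over union (IoU): a prediction counts as a match when its IoU with a ground-truth object exceeds a threshold, and AP is averaged over thresholds from 0.5 to 0.95. The following is a minimal sketch of box IoU in plain Python. It is not the evaluator's own implementation, and the helper name is hypothetical.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x0, y0, x1, y1) in absolute pixels."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(box_iou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0
print(box_iou((0, 0, 2, 2), (2, 2, 4, 4)))  # 0.0
```

For the "segm" task the same ratio is computed on masks instead of boxes.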

Other models are trained in the same way.

Run panoptic segmentation on a video (omitted)

Keywords: neural networks Pytorch Computer Vision Deep Learning CV

Added by plimpton on Sat, 19 Feb 2022 05:59:11 +0200