Detectron2 quick start
This walkthrough follows the Getting Started section of the official Colab Notebook:
1. Use a pre-trained Detectron2 model
To run inference on an image, we create a detectron2 config and then build a DefaultPredictor from that config for single-image inference.
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()  # get the default config
# update the config from the mask_rcnn_R_50_FPN_3x yaml configuration file
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
# set the score threshold for displaying predictions to 0.5
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
# select the trained weights that match the configuration file
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
# create a DefaultPredictor
predictor = DefaultPredictor(cfg)
# run inference on an image im loaded with cv2.imread(...)
outputs = predictor(im)
Inspect the network outputs:
print(outputs['instances'].pred_classes)
print(outputs['instances'].pred_boxes)
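Besides pred_classes and pred_boxes, the returned Instances object carries further per-instance fields; a minimal sketch of inspecting two of them (scores and, for segmentation models, pred_masks):

print(outputs['instances'].scores)            # confidence score of each detected instance
print(outputs['instances'].pred_masks.shape)  # binary masks with shape (N, H, W)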
We can use the Visualizer class to visualize the output results:
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
from google.colab.patches import cv2_imshow

v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs['instances'].to('cpu'))
cv2_imshow(out.get_image()[:, :, ::-1])
2. Training on custom datasets
Prepare dataset
In this section, we learn how to train on a custom dataset, taking the balloon segmentation dataset as an example. This dataset has only one class: balloon. We fine-tune a model pre-trained on the COCO dataset.
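The balloon dataset is not bundled with detectron2; as in the official Colab, it can be downloaded from the Matterport Mask_RCNN releases (URL assumed from that notebook) and extracted in the working directory:

# download and extract the balloon dataset (notebook shell commands)
!wget https://github.com/matterport/Mask_RCNN/releases/download/v2.1/balloon_dataset.zip
!unzip balloon_dataset.zip > /dev/null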
After preparing the dataset, we first try to convert it into COCO format. A dataset that is already in COCO format can be registered directly as follows:
from detectron2.data.datasets import register_coco_instances

register_coco_instances("my_dataset_train", {}, "json_annotation_train.json", "path/to/image/dir")
register_coco_instances("my_dataset_val", {}, "json_annotation_val.json", "path/to/image/dir")
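Once registered, these names can be used wherever the config expects a dataset name; a minimal sketch using the names registered above:

# point the config at the registered datasets
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)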
If our dataset is in its own format, we need a custom function that converts it into the list of dicts that detectron2 expects. First we define a get_balloon_dicts(img_dir) function that builds dataset_dicts from the dataset folder, then register it via DatasetCatalog.register. Class metadata is attached with MetadataCatalog.get(...).set(...) and can be retrieved with MetadataCatalog.get during training.
import os, json
import cv2
import numpy as np
from detectron2.structures import BoxMode
from detectron2.data import DatasetCatalog, MetadataCatalog

def get_balloon_dicts(img_dir):
    json_file = os.path.join(img_dir, "via_region_data.json")
    with open(json_file) as f:
        imgs_anns = json.load(f)

    dataset_dicts = []
    for idx, v in enumerate(imgs_anns.values()):
        record = {}
        filename = os.path.join(img_dir, v["filename"])
        height, width = cv2.imread(filename).shape[:2]
        # For each image, we record: 1. the file name, 2. an image id,
        # 3. the image height, 4. the image width
        record["file_name"] = filename
        record["image_id"] = idx
        record["height"] = height
        record["width"] = width

        annos = v["regions"]
        objs = []
        # for each annotation of a single image
        for _, anno in annos.items():
            assert not anno["region_attributes"]
            anno = anno["shape_attributes"]
            px = anno["all_points_x"]
            py = anno["all_points_y"]
            poly = [(x + 0.5, y + 0.5) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]
            # build the object dict; the polygon outline also yields the bounding box
            obj = {
                "bbox": [np.min(px), np.min(py), np.max(px), np.max(py)],
                "bbox_mode": BoxMode.XYXY_ABS,
                "segmentation": [poly],
                "category_id": 0,
            }
            objs.append(obj)
        # put the converted objects into the record
        record["annotations"] = objs
        # add this image's record to dataset_dicts
        dataset_dicts.append(record)
    return dataset_dicts

for d in ["train", "val"]:
    DatasetCatalog.register("balloon_" + d, lambda d=d: get_balloon_dicts("balloon/" + d))
    MetadataCatalog.get("balloon_" + d).set(thing_classes=["balloon"])
balloon_metadata = MetadataCatalog.get("balloon_train")
After this, we can check whether the dataset is registered correctly:
import random

dataset_dicts = get_balloon_dicts("balloon/train")
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=balloon_metadata, scale=0.5)
    out = visualizer.draw_dataset_dict(d)
    # cv2_imshow comes from google.colab.patches, not from OpenCV itself;
    # outside Colab, cv2.imshow can be used to display the image instead.
    cv2_imshow(out.get_image()[:, :, ::-1])
Train
After preparing the dataset, we can start training. As an example, we fine-tune the R50-FPN Mask R-CNN model pre-trained on the COCO dataset.
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
# "balloon_train" has already been registered; specify it as the training set
cfg.DATASETS.TRAIN = ("balloon_train",)
# load the pre-trained weights
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
# basic training parameters
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 300
cfg.SOLVER.STEPS = []  # do not decay the learning rate
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # the balloon dataset has a single class

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
# call trainer.train() to start training
trainer.train()
The training curves can be visualized in TensorBoard:
%load_ext tensorboard
%tensorboard --logdir output
Inference and Evaluation
Now that the model has been trained, we need to evaluate it and run inference on single images. We create a predictor; the config used for inference must be consistent with the one used for training, with only a few minor changes:
# load the trained weights
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
# only boxes with a score above 0.7 are visualized
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7
predictor = DefaultPredictor(cfg)
Then we randomly sample a few images from the validation set:
from detectron2.utils.visualizer import ColorMode

dataset_dicts = get_balloon_dicts("balloon/val")
for d in random.sample(dataset_dicts, 3):
    im = cv2.imread(d["file_name"])
    outputs = predictor(im)
    # for the format of outputs, see:
    # https://detectron2.readthedocs.io/tutorials/models.html#model-output-format
    v = Visualizer(
        im[:, :, ::-1],
        metadata=balloon_metadata,
        scale=0.5,
        instance_mode=ColorMode.IMAGE_BW,  # removes the colors of unsegmented pixels
    )
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
We can use COCOEvaluator to compute AP and other metrics on the dataset:
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader

evaluator = COCOEvaluator("balloon_val", ("bbox", "segm"), False, output_dir="./output/")
val_loader = build_detection_test_loader(cfg, "balloon_val")
print(inference_on_dataset(trainer.model, val_loader, evaluator))
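An equivalent way to run the same evaluation, assuming DefaultTrainer's test helper, is:

# equivalent to the inference_on_dataset call above
print(DefaultTrainer.test(cfg, trainer.model, evaluators=[evaluator]))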
Training other models follows the same procedure.
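For example, to train a detection-only Faster R-CNN on the same data, only the config path and weights change; a sketch under that assumption (config name taken from the model zoo):

# swap Mask R-CNN for Faster R-CNN; the rest of the workflow is unchanged
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("balloon_train",)
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()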