MMDetection official Chinese documentation 1: Inference on standard datasets with existing models

MMDetection provides hundreds of detection models in its Model Zoo and supports a variety of standard datasets, including Pascal VOC, COCO, Cityscapes, LVIS, etc. This document describes how to use these models and standard datasets to run some common tasks, including:

  • Inference on given images with existing models
  • Test existing models on standard datasets
  • Train predefined models on standard datasets

Inference with existing models

Inference means using a trained model to detect objects in images. In MMDetection, a model is defined by a configuration file, and its trained parameters are stored in a checkpoint file.

To start, we suggest using Faster R-CNN; its configuration file and checkpoint file are available here.
We recommend downloading the checkpoint file to the checkpoints folder.
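
A minimal shell sketch for this step (the download URL is not reproduced here; copy the checkpoint link for faster_rcnn_r50_fpn_1x_coco from the model zoo page and substitute it for the placeholder):

mkdir -p checkpoints
# Replace ${CHECKPOINT_URL} with the model zoo link to faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
wget -P checkpoints ${CHECKPOINT_URL}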

High-level API for inference

MMDetection provides a high-level Python API for inference on images. The following example shows how to build a model and run inference on images and a video.

from mmdet.apis import init_detector, inference_detector
import mmcv

# Specify the configuration file and checkpoint file path of the model
config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'

# Build the model according to the configuration file and checkpoint file
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Test a single picture and show the results
img = 'test.jpg'  # Or img = mmcv.imread(img), so the picture will be read only once
result = inference_detector(model, img)
# Visualize the results in a new window
model.show_result(img, result)
# Or save the visualization results as pictures
model.show_result(img, result, out_file='result.jpg')

# Test the video and show the results
video = mmcv.VideoReader('video.mp4')
for frame in video:
    result = inference_detector(model, frame)
    model.show_result(frame, result, wait_time=1)

A Jupyter notebook demo can be found in demo/inference_demo.ipynb.

Asynchronous interface - supports Python 3.7+

For Python 3.7+, MMDetection also provides an asynchronous interface. By using CUDA streams, GPU-bound inference code does not block the CPU, which allows higher CPU/GPU utilization in single-threaded applications. Inference on different data samples and with different models can run concurrently.

See tests/async_benchmark.py to compare the speed of the synchronous and asynchronous interfaces.

import asyncio
import torch
from mmdet.apis import init_detector, async_inference_detector
from mmdet.utils.contextmanagers import concurrent

async def main():
    config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
    checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
    device = 'cuda:0'
    model = init_detector(config_file, checkpoint=checkpoint_file, device=device)

    # This queue is used to infer multiple images in parallel
    streamqueue = asyncio.Queue()
    # The queue size defines the degree of parallelism
    streamqueue_size = 3

    for _ in range(streamqueue_size):
        streamqueue.put_nowait(torch.cuda.Stream(device=device))

    # Test a single picture and show the results
    img = 'test.jpg'  # Or img = mmcv.imread(img), so the image will only be read once

    async with concurrent(streamqueue):
        result = await async_inference_detector(model, img)

    # Visualize the results in a new window
    model.show_result(img, result)
    # Or save the visualization results as pictures
    model.show_result(img, result, out_file='result.jpg')


asyncio.run(main())

Demo scripts

We also provide three demo scripts, implemented with the high-level API. The source code is available here.

Image demo

This script performs inference on a single image. You can enable asynchronous inference with --async-test.

python demo/image_demo.py \
    ${IMAGE_FILE} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--score-thr ${SCORE_THR}] \
    [--async-test]

Run example:

python demo/image_demo.py demo/demo.jpg \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --device cpu

Webcam demo

This script performs inference on live frames from a webcam.

python demo/webcam_demo.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--camera-id ${CAMERA-ID}] \
    [--score-thr ${SCORE_THR}]

Run example:

python demo/webcam_demo.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth

Video demo

This script performs inference on a video.

python demo/video_demo.py \
    ${VIDEO_FILE} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--score-thr ${SCORE_THR}] \
    [--out ${OUT_FILE}] \
    [--show] \
    [--wait-time ${WAIT_TIME}]

Run example:

python demo/video_demo.py demo/demo.mp4 \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --out result.mp4

Test existing models on standard datasets

To evaluate a model's accuracy, we usually test it on a standard dataset. MMDetection supports multiple public datasets, including COCO, Pascal VOC, Cityscapes, and more.
This section describes how to test an existing model on a supported dataset.

Dataset preparation

Public datasets such as Pascal VOC (and its mirrors) or COCO can be obtained from their official websites or mirrors.
Note: for detection tasks, Pascal VOC 2012 is an extension of Pascal VOC 2007 that does not overlap with it, and the two are usually used together.
We recommend downloading and extracting the dataset to a folder outside the project, and then symlinking the dataset root into the $MMDETECTION/data folder, in the layout shown below (a sketch of the symlink command follows the directory tree).
If your folder structure differs from the one below, you need to change the corresponding paths in the configuration files.

mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   ├── cityscapes
│   │   ├── annotations
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── VOCdevkit
│   │   ├── VOC2007
│   │   ├── VOC2012

Some models, such as HTC, DetectoRS and SCNet, require the additional COCO-stuff dataset, which can be downloaded and unzipped into the coco folder. The resulting folder structure is as follows:

mmdetection
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   │   ├── stuffthingmaps

Panoptic segmentation models such as PanopticFPN require the additional COCO Panoptic dataset, which can be downloaded and unzipped into the coco/annotations folder. The resulting folder structure is as follows:

mmdetection
├── data
│   ├── coco
│   │   ├── annotations
│   │   │   ├── panoptic_train2017.json
│   │   │   ├── panoptic_train2017
│   │   │   ├── panoptic_val2017.json
│   │   │   ├── panoptic_val2017
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017

The annotations of the Cityscapes dataset need to be converted to the COCO annotation format. Use tools/dataset_converters/cityscapes.py for the conversion:

pip install cityscapesscripts

python tools/dataset_converters/cityscapes.py \
    ./data/cityscapes \
    --nproc 8 \
    --out-dir ./data/cityscapes/annotations

Testing existing models

We provide test scripts to evaluate an existing model on all supported datasets (COCO, Pascal VOC, Cityscapes, etc.). The following test environments are supported:

  • Single GPU test
  • Single node multi GPU test
  • Multi node test

Choose the appropriate script for your test environment:

# Single GPU test
python tools/test.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--out ${RESULT_FILE}] \
    [--eval ${EVAL_METRICS}] \
    [--show]

# Single node multi GPU test
bash tools/dist_test.sh \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    ${GPU_NUM} \
    [--out ${RESULT_FILE}] \
    [--eval ${EVAL_METRICS}]

tools/dist_test.sh also supports multi-node testing, but it relies on PyTorch's launch utility.

Optional parameters:

  • RESULT_FILE: filename of the output results, saved in pickle (.pkl) format. If not specified, the results will not be saved to a file.
  • EVAL_METRICS: metrics to evaluate. The allowed values depend on the dataset: proposal_fast, proposal, bbox and segm are available for the COCO dataset, while mAP and recall are available for Pascal VOC. The Cityscapes dataset can be evaluated with the Cityscapes metrics as well as all COCO metrics.
  • --show: if specified, detection results are drawn on the images and displayed in a new window. Only applicable to single-GPU testing, and intended for debugging and visualization. Make sure a GUI is available in your environment, otherwise you may encounter an error like cannot connect to X server.
  • --show-dir: if specified, detection results are drawn on the images and saved to the given directory. Only applicable to single-GPU testing, for debugging and visualization. This option works even when your environment has no GUI.
  • --show-score-thr: if specified, detections with scores below this threshold are removed.
  • --cfg-options: if specified, the key-value pairs are merged into the configuration file.
  • --eval-options: if specified, the key-value pairs are passed as keyword arguments to the dataset's evaluate() function; only used during evaluation.

Examples

Assume that the checkpoint files have been downloaded to the checkpoints/ folder.

  1. Test Faster R-CNN and visualize the results. Press any key to continue to the next image. The configuration file and checkpoint file are available here.

    python tools/test.py \
        configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
        checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
        --show
    
  2. Test Faster R-CNN and save the rendered images for later visualization. The configuration file and checkpoint file are available here.

    python tools/test.py \
        configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
        checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
        --show-dir faster_rcnn_r50_fpn_1x_results
    
  3. Test Faster R-CNN on the Pascal VOC dataset without saving the test results, and evaluate mAP. The configuration file and checkpoint file are available here.

    python tools/test.py \
        configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py \
        checkpoints/faster_rcnn_r50_fpn_1x_voc0712_20200624-c9895d40.pth \
        --eval mAP
    
  4. Test Mask R-CNN with 8 GPUs and evaluate bbox and segm mAP. The configuration file and checkpoint file are available here.

    ./tools/dist_test.sh \
        configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
        checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
        8 \
        --out results.pkl \
        --eval bbox segm
    
  5. Test Mask R-CNN with 8 GPUs and evaluate the per-class bbox and segm mAP. The configuration file and checkpoint file are available here.

    ./tools/dist_test.sh \
        configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
        checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
        8 \
        --out results.pkl \
        --eval bbox segm \
        --options "classwise=True"
    
  6. Test Mask R-CNN on the COCO test-dev dataset with 8 GPUs, generate JSON files and submit them to the official evaluation server. The configuration file and checkpoint file are available here.

    ./tools/dist_test.sh \
        configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
        checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
        8 \
        --format-only \
        --options "jsonfile_prefix=./mask_rcnn_test-dev_results"
    

This command generates two JSON files: mask_rcnn_test-dev_results.bbox.json and mask_rcnn_test-dev_results.segm.json.

  7. Test Mask R-CNN on the Cityscapes dataset with 8 GPUs, generate txt and png files, and upload them to the official evaluation server. The configuration file and checkpoint file are available here.

    ./tools/dist_test.sh \
        configs/cityscapes/mask_rcnn_r50_fpn_1x_cityscapes.py \
        checkpoints/mask_rcnn_r50_fpn_1x_cityscapes_20200227-afe51d5a.pth \
        8 \
        --format-only \
        --options "txtfile_prefix=./mask_rcnn_cityscapes_test_results"
    

The generated png and txt files will be in the ./mask_rcnn_cityscapes_test_results folder.

Testing without ground truth annotations

MMDetection supports testing models without ground truth annotations, which relies on CocoDataset. If your dataset is not in COCO format, please convert it. For example, if your dataset is in VOC or Cityscapes format, you can use the scripts in tools/dataset_converters to convert it directly to COCO format. For other formats, you can use the images2coco script (an example invocation is sketched after the parameter list below).

python tools/dataset_converters/images2coco.py \
    ${IMG_PATH} \
    ${CLASSES} \
    ${OUT} \
    [--exclude-extensions]

Parameters:

  • IMG_PATH: root path of the images.
  • CLASSES: text file with the list of class names, one category per line.
  • OUT: output JSON filename. By default, it is saved at the same directory level as IMG_PATH.
  • --exclude-extensions: file extensions to exclude.
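
For example, a hypothetical conversion (the image folder, class file and output name below are placeholders, not part of the official example) could look like:

python tools/dataset_converters/images2coco.py \
    ./data/my_images \
    ./data/classes.txt \
    test_images.json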

After the conversion, use the following command to test

# Single GPU test
python tools/test.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    --format-only \
    --options ${JSONFILE_PREFIX} \
    [--show]

# Single node multi GPU test
bash tools/dist_test.sh \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    ${GPU_NUM} \
    --format-only \
    --options ${JSONFILE_PREFIX} \
    [--show]

Assuming that the checkpoint file from the model zoo has been downloaded to the checkpoints/ folder,
we can test Mask R-CNN on the COCO test-dev dataset with 8 GPUs and generate the JSON files with the following command.

./tools/dist_test.sh \
    configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
    8 \
    --format-only \
    --options "jsonfile_prefix=./mask_rcnn_test-dev_results"

This command generates two JSON files: mask_rcnn_test-dev_results.bbox.json and mask_rcnn_test-dev_results.segm.json.

Batch inference

In test mode, MMDetection supports inference on both single images and batches of images. By default, single-image inference is used; you can enable batch inference by modifying samples_per_gpu in the test data section of the configuration file.
The configuration change for batch inference is as follows:

data = dict(train=dict(...), val=dict(...), test=dict(samples_per_gpu=2, ...))

Alternatively, you can enable it on the command line via --cfg-options, e.g. --cfg-options data.test.samples_per_gpu=2, as in the sketch below.
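
As a concrete sketch, a single-GPU test with batch inference enabled could look like the following, reusing the Faster R-CNN configuration and checkpoint from earlier:

python tools/test.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --eval bbox \
    --cfg-options data.test.samples_per_gpu=2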

Deprecated ImageToTensor

In test mode, the ImageToTensor step is deprecated and replaced by DefaultFormatBundle. It is recommended to replace it manually in the test data pipeline of your configuration file, for example:

# (deprecated) use ImageToTensor
pipelines = [
   dict(type='LoadImageFromFile'),
   dict(
       type='MultiScaleFlipAug',
       img_scale=(1333, 800),
       flip=False,
       transforms=[
           dict(type='Resize', keep_ratio=True),
           dict(type='RandomFlip'),
           dict(type='Normalize', mean=[0, 0, 0], std=[1, 1, 1]),
           dict(type='Pad', size_divisor=32),
           dict(type='ImageToTensor', keys=['img']),
           dict(type='Collect', keys=['img']),
       ])
   ]

# (recommended) manually replace ImageToTensor with DefaultFormatBundle
pipelines = [
   dict(type='LoadImageFromFile'),
   dict(
       type='MultiScaleFlipAug',
       img_scale=(1333, 800),
       flip=False,
       transforms=[
           dict(type='Resize', keep_ratio=True),
           dict(type='RandomFlip'),
           dict(type='Normalize', mean=[0, 0, 0], std=[1, 1, 1]),
           dict(type='Pad', size_divisor=32),
           dict(type='DefaultFormatBundle'),
           dict(type='Collect', keys=['img']),
       ])
   ]

Training predefined models on standard datasets

MMDetection also provides out-of-the-box tools for training detection models. This section shows how to train a predefined model on a standard dataset such as COCO.

Important: the batch size equals the number of GPUs * samples_per_gpu (set in configs/_base_/datasets/...py). The learning rate in the configuration files is set for 8 GPUs with 2 images per GPU (batch size = 8 * 2 = 16).
According to the linear scaling rule, if you use a different number of GPUs or a different number of images per GPU, you need to set the learning rate proportionally to the batch size, for example,
lr=0.01 for 4 GPUs with 2 images per GPU, and lr=0.08 for 16 GPUs with 4 images per GPU; a config sketch is shown below.
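
A minimal config sketch under these assumptions (the file name is hypothetical; the SGD fields mirror the common MMDetection defaults and should be checked against your base config):

# my_faster_rcnn_4gpu.py (hypothetical config, assumed to live under configs/)
_base_ = './faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'

# Batch size is 4 GPUs * 2 imgs/GPU = 8, half of the default 16,
# so the default lr of 0.02 is halved following the linear scaling rule
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)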

Datasets

Datasets need to be prepared before training. See Dataset preparation above for details.

Note:
Currently, the configuration files in the configs/cityscapes folder are initialized with COCO pretrained weights. If your network connection is unavailable or slow, you may want to download the existing models in advance; otherwise, errors may occur at the start of training.

Training with a single GPU

We provide tools/train.py to start training tasks on a single GPU. The basic usage is as follows:

python tools/train.py \
    ${CONFIG_FILE} \
    [optional arguments]

During training, log files and checkpoint files are saved to the working directory, which is specified by work_dir in the configuration file or by the --work-dir CLI argument, as in the sketch below.
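
A minimal example, with a hypothetical working directory name:

python tools/train.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    --work-dir work_dirs/faster_rcnn_r50_fpn_1x_coco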

By default, the model is evaluated on the validation set after every training epoch. The evaluation interval can be specified in the configuration file:

# Evaluate the model every 12 epochs
evaluation = dict(interval=12)

This tool accepts the following parameters:

  • --no-validate: disable evaluation during training
  • --work-dir ${WORK_DIR}: override the working directory
  • --resume-from ${CHECKPOINT_FILE}: resume training from a checkpoint file
  • --options 'Key=value': override other settings in the configuration file used

Note:
The difference between resume-from and load-from:

resume-from loads both the model weights and the optimizer state, and it also inherits the iteration count from the specified checkpoint, so training continues rather than restarting. load-from only loads the model weights; training starts from scratch, and it is often used for fine-tuning a model. A sketch of both options follows.
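
A minimal sketch of the two options (the checkpoint paths below are hypothetical):

# resume-from: restores weights, optimizer state and iteration count, continuing an interrupted run
python tools/train.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    --resume-from work_dirs/faster_rcnn_r50_fpn_1x_coco/latest.pth

# load-from: only the weights are loaded and training starts from scratch (typical for fine-tuning);
# it is set in the configuration file, e.g.
# load_from = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'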

Training on multiple GPUs

We provide tools/dist_train.sh to launch training on multiple GPUs. The basic usage is as follows:

bash ./tools/dist_train.sh \
    ${CONFIG_FILE} \
    ${GPU_NUM} \
    [optional arguments]

The optional parameters are consistent with those described in the previous section.

Launching multiple jobs simultaneously

If you want to launch multiple jobs on a single machine, e.g. two jobs that each require 4 GPUs on a machine with 8 GPUs, you need to specify a different port for each job (29500 by default) to avoid conflicts.

If you use dist_train.sh to launch the training jobs, you can set the ports with the following commands.

CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4

Training on multiple nodes

MMDetection relies on the torch.distributed package for distributed training. Thus, as a basic usage, distributed training can be launched via PyTorch's launch utility.

Managing jobs with Slurm

Slurm is a widely used job scheduling system for compute clusters. On a cluster managed by Slurm, you can use slurm_train.sh to launch training jobs. It supports both single-node and multi-node training.

The basic usage is as follows:

[GPUS=${GPUS}] ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}

The following example uses 16 GPUs to train Mask R-CNN on a Slurm partition named dev, with the working directory set to a shared file system.

GPUS=16 ./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x

You can check the source code for all the arguments and environment variables.

When using Slurm, the port needs to be set in one of the following ways.

  1. Set the port through --options. This is highly recommended because it does not require changing the original configuration files.

    CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR} --options 'dist_params.port=29500'
    CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR} --options 'dist_params.port=29501'
    
  2. Modify the configuration files to set different communication ports.

    In config1.py, set:

    dist_params = dict(backend='nccl', port=29500)
    

    In config2.py, set:

    dist_params = dict(backend='nccl', port=29501)
    

    Then you can launch the two jobs with config1.py and config2.py.

    CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
    CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}
    
