MMDetection — 2. Quick Start (translation of the official Chinese documentation)

1: Inference and training with existing models on standard datasets

  • MMDetection provides hundreds of pretrained detection models in its Model Zoo and supports multiple standard datasets, including PASCAL VOC, COCO, CityScapes, LVIS, etc. This note explains how to perform common tasks with these existing models and standard datasets, including:
    • Infer a given image using an existing model.
    • Test existing models on standard datasets.
    • Train predefined models on standard datasets.

Inference with existing models

By inference, we mean using a trained model to detect objects in images. In MMDetection, a model is defined by a configuration file, and existing model parameters are saved in a checkpoint file.
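
A config is an ordinary Python file that mmcv can load and inspect. The following minimal sketch (not part of the original document) shows what that looks like:

# a minimal sketch: load and inspect a model config with mmcv
from mmcv import Config

cfg = Config.fromfile('configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py')
print(cfg.model.type)  # e.g. 'FasterRCNN'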

To start with, we recommend using Faster R-CNN with this configuration file and this checkpoint file. It is recommended to download the checkpoint file to a checkpoints folder under the project directory.

High-level APIs for inference

MMDetection provides high-level Python APIs for inference on images. Below is an example of building a model and running inference on a given image or video.

from mmdet.apis import init_detector, inference_detector
import mmcv

# Specify the path to model config and checkpoint file
config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'

# build the model from a config file and a checkpoint file
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# test a single image and show the results
img = 'test.jpg'  # or img = mmcv.imread(img), which will only load it once
result = inference_detector(model, img)
# visualize the results in a new window
model.show_result(img, result)
# or save the visualization results to image files
model.show_result(img, result, out_file='result.jpg')

# test a video and show the results
video = mmcv.VideoReader('video.mp4')
for frame in video:
    result = inference_detector(model, frame)
    model.show_result(frame, result, wait_time=1)

A notebook demo can be found in demo/inference_demo.ipynb.

Note: inference_detector currently only supports single-image inference.
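
Since inference_detector handles one image per call, a plain loop is the simplest way to process several files. A minimal sketch (the image paths are hypothetical):

# a minimal sketch: single-image inference over several files
# ('model' is the detector built above; paths are hypothetical)
for img in ['demo/demo.jpg', 'test.jpg']:
    result = inference_detector(model, img)
    model.show_result(img, result, out_file=img + '.result.jpg')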

Asynchronous interface - supported on Python 3.7+

For Python 3.7+, MMDetection also supports an asynchronous interface. By not blocking the CPU on GPU-bound inference code, it enables better CPU/GPU utilization. Inference can be carried out concurrently, either between different input data samples or between different models of an inference pipeline.

See tests/async_benchmark.py for a comparison of the speed between the synchronous and asynchronous interfaces.

import asyncio
import torch
from mmdet.apis import init_detector, async_inference_detector
from mmdet.utils.contextmanagers import concurrent

async def main():
    config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
    checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
    device = 'cuda:0'
    model = init_detector(config_file, checkpoint=checkpoint_file, device=device)

    # queue is used for concurrent inference of multiple images
    streamqueue = asyncio.Queue()
    # queue size defines concurrency level
    streamqueue_size = 3

    for _ in range(streamqueue_size):
        streamqueue.put_nowait(torch.cuda.Stream(device=device))

    # test a single image and show the results
    img = 'test.jpg'  # or img = mmcv.imread(img), which will only load it once

    async with concurrent(streamqueue):
        result = await async_inference_detector(model, img)

    # visualize the results in a new window
    model.show_result(img, result)
    # or save the visualization results to image files
    model.show_result(img, result, out_file='result.jpg')


asyncio.run(main())
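
As noted above, inference can run concurrently between different input samples. One way to illustrate this is to extend main() with asyncio.gather; a sketch (not from the original document) that reuses the model and streamqueue built above, with hypothetical file paths:

# a sketch: concurrent inference over several images, placed inside main()
async def infer_one(img):
    # borrow a CUDA stream from the queue for the duration of this call
    async with concurrent(streamqueue):
        return await async_inference_detector(model, img)

results = await asyncio.gather(*(infer_one(p) for p in ['a.jpg', 'b.jpg']))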

Demos

We also provide three demo scripts, implemented with the high-level APIs and supporting functional code. The source code can be found here.

Image demo

This script performs inference on a single image.

python demo/image_demo.py \
    ${IMAGE_FILE} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--score-thr ${SCORE_THR}]

Example:

python demo/image_demo.py demo/demo.jpg \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --device cpu

Webcam demo

This is a live demo from a webcam.

python demo/webcam_demo.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--camera-id ${CAMERA-ID}] \
    [--score-thr ${SCORE_THR}]

Example:

python demo/webcam_demo.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth

Video demo

This script performs inference on a video.

python demo/video_demo.py \
    ${VIDEO_FILE} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--score-thr ${SCORE_THR}] \
    [--out ${OUT_FILE}] \
    [--show] \
    [--wait-time ${WAIT_TIME}]

Example:

python demo/video_demo.py demo/demo.mp4 \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --out result.mp4

Test existing models on standard datasets

To evaluate a model's accuracy, one usually tests the model on some standard datasets. MMDetection supports multiple public datasets, including COCO, PASCAL VOC, CityScapes, and more. This section shows how to test an existing model on these supported datasets.

Prepare dataset

Public datasets such as PASCAL VOC or COCO are available from their official websites or mirrors. Note: in detection tasks, Pascal VOC 2012 is an extension of Pascal VOC 2007 without any overlap, and the two are usually used together. It is recommended to download the datasets, extract them somewhere outside the project directory, and symlink the dataset roots to $MMDETECTION/data as shown below. If your folder structure is different, you may need to change the corresponding paths in the config files.

mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   ├── cityscapes
│   │   ├── annotations
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── VOCdevkit
│   │   ├── VOC2007
│   │   ├── VOC2012
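
If a dataset already lives elsewhere on disk, symlinking it into data/ reproduces the layout above without copying anything. A sketch (/path/to/coco is a hypothetical download location):

# a sketch: link an existing dataset root into $MMDETECTION/data
# ('/path/to/coco' is a hypothetical location of the extracted dataset)
import os

os.makedirs('data', exist_ok=True)
os.symlink('/path/to/coco', 'data/coco')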

Some models, such as HTC, DetectoRS, and SCNet, also require the COCO-stuff dataset; download, unzip, and move it into the coco folder. The directory should look like this.

mmdetection
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   │   ├── stuffthingmaps

The CityScapes annotations need to be converted to COCO format using tools/dataset_converters/cityscapes.py:

pip install cityscapesscripts

python tools/dataset_converters/cityscapes.py \
    ./data/cityscapes \
    --nproc 8 \
    --out-dir ./data/cityscapes/annotations

TODO: change to the new path

Testing existing models

We provide test scripts for evaluating existing models on whole datasets (COCO, PASCAL VOC, CityScapes, etc.). The following testing environments are supported:

  • Single GPU
  • Single node, multiple GPUs
  • Multiple nodes

Choose the proper script to perform testing depending on your environment.

# single-gpu testing
python tools/test.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--out ${RESULT_FILE}] \
    [--eval ${EVAL_METRICS}] \
    [--show]

# multi-gpu testing
bash tools/dist_test.sh \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    ${GPU_NUM} \
    [--out ${RESULT_FILE}] \
    [--eval ${EVAL_METRICS}]

tools/dist_test.sh also supports multi-node testing, but it relies on PyTorch's launch utility.

Optional arguments:

  • RESULT_FILE: filename of the output results in pickle format. If not specified, the results will not be saved to a file.
  • EVAL_METRICS: items to be evaluated on the results. Allowed values depend on the dataset, e.g. proposal_fast, proposal, bbox, segm are available for COCO, and mAP, recall for PASCAL VOC. CityScapes can be evaluated with cityscapes as well as all COCO metrics.
  • --show: if specified, detection results will be plotted on the images and shown in a new window. It is only applicable to single-GPU testing and is used for debugging and visualization. Make sure a GUI is available in your environment; otherwise you may encounter an error like: cannot connect to X server.
  • --show-dir: if specified, detection results will be plotted on the images and saved to the specified directory. It is only applicable to single-GPU testing and is used for debugging and visualization. You do not need a GUI available in your environment to use this option.
  • --show-score-thr: if specified, detections with scores below this threshold will be removed.
  • --cfg-options: if specified, the key-value pair options will be merged into the config file.
  • --eval-options: if specified, the key-value pair options will be passed as kwargs to the dataset's evaluate() function; this is for evaluation only.

Examples

Assume that you have already downloaded the checkpoints to the directory checkpoints/.

  1. Test Faster R-CNN and visualize the results. Press any key for the next image. The config and checkpoint files are available here.

    python tools/test.py \
        configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
        checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
        --show
    
  2. Test Faster R-CNN and save the painted images for later visualization. The config and checkpoint files are available here.

    python tools/test.py \
        configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
        checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
        --show-dir faster_rcnn_r50_fpn_1x_results
    
  3. Test Faster R-CNN on PASCAL VOC without saving the test results, and evaluate the mAP. The config and checkpoint files are available here.

    python tools/test.py \
        configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc.py \
        checkpoints/faster_rcnn_r50_fpn_1x_voc0712_20200624-c9895d40.pth \
        --eval mAP
    
  4. Test Mask R-CNN with 8 GPUs, and evaluate the bbox and mask AP. The config and checkpoint files are available here.

    ./tools/dist_test.sh \
        configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
        checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
        8 \
        --out results.pkl \
        --eval bbox segm
    
  5. Test Mask R-CNN with 8 GPUs, and evaluate the class-wise bbox and mask AP. The config and checkpoint files are available here.

    ./tools/dist_test.sh \
        configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
        checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
        8 \
        --out results.pkl \
        --eval bbox segm \
        --options "classwise=True"
    
  6. Test Mask R-CNN on COCO test-dev with 8 GPUs, and generate JSON files for submitting to the official evaluation server. The config and checkpoint files are available here.

    ./tools/dist_test.sh \
        configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
        checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
        8 \
        --format-only \
        --options "jsonfile_prefix=./mask_rcnn_test-dev_results"
    

    This command generates two JSON files: mask_rcnn_test-dev_results.bbox.json and mask_rcnn_test-dev_results.segm.json.

  7. Test Mask R-CNN on Cityscapes with 8 GPUs, and generate txt and png files for submitting to the official evaluation server. The config and checkpoint files are available here.

    ./tools/dist_test.sh \
        configs/cityscapes/mask_rcnn_r50_fpn_1x_cityscapes.py \
        checkpoints/mask_rcnn_r50_fpn_1x_cityscapes_20200227-afe51d5a.pth \
        8 \
        --format-only \
        --options "txtfile_prefix=./mask_rcnn_cityscapes_test_results"
    

    The generated png and txt files will be under the ./mask_rcnn_cityscapes_test_results directory.

Test without ground-truth annotations

MMDetection supports testing models without ground-truth annotations using CocoDataset. If your dataset format is not COCO, convert it to COCO format first. For example, if it is in VOC format, you can convert it directly to COCO format with the script in tools.

# single-gpu testing
python tools/test.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    --format-only \
    --options ${JSONFILE_PREFIX} \
    [--show]

# multi-gpu testing
bash tools/dist_test.sh \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    ${GPU_NUM} \
    --format-only \
    --options ${JSONFILE_PREFIX} \
    [--show]

Assuming that the checkpoints from the model zoo have been downloaded to the directory checkpoints/, we can test Mask R-CNN on COCO test-dev with 8 GPUs and generate JSON files with the following command.

./tools/dist_test.sh \
    configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
    8 \
    --format-only \
    --options "jsonfile_prefix=./mask_rcnn_test-dev_results"

This command generates two JSON files: mask_rcnn_test-dev_results.bbox.json and mask_rcnn_test-dev_results.segm.json.

Batch inference

MMDetection supports inference with a single image or batched images in test mode. By default, single-image inference is used; you can modify samples_per_gpu in the test data config to use batch inference. The config can be modified as follows.

data = dict(train=dict(...), val=dict(...), test=dict(samples_per_gpu=2, ...))

Or you can set it via --cfg-options as --cfg-options data.test.samples_per_gpu=2.
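
The same override can also be applied programmatically when working with configs in Python. A sketch (not part of the original document):

# a sketch: enable batch inference in test mode via mmcv's Config
from mmcv import Config

cfg = Config.fromfile('configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py')
cfg.data.test.samples_per_gpu = 2  # batch size per GPU at test time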

Deprecated ImageToTensor

In test mode, the ImageToTensor pipeline is deprecated; DefaultFormatBundle is recommended instead. It is recommended to manually replace it in the test data pipeline of your config file. For example:

# use ImageToTensor (deprecated)
pipelines = [
   dict(type='LoadImageFromFile'),
   dict(
       type='MultiScaleFlipAug',
       img_scale=(1333, 800),
       flip=False,
       transforms=[
           dict(type='Resize', keep_ratio=True),
           dict(type='RandomFlip'),
           dict(type='Normalize', mean=[0, 0, 0], std=[1, 1, 1]),
           dict(type='Pad', size_divisor=32),
           dict(type='ImageToTensor', keys=['img']),
           dict(type='Collect', keys=['img']),
       ])
   ]

# manually replace ImageToTensor to DefaultFormatBundle (recommended)
pipelines = [
   dict(type='LoadImageFromFile'),
   dict(
       type='MultiScaleFlipAug',
       img_scale=(1333, 800),
       flip=False,
       transforms=[
           dict(type='Resize', keep_ratio=True),
           dict(type='RandomFlip'),
           dict(type='Normalize', mean=[0, 0, 0], std=[1, 1, 1]),
           dict(type='Pad', size_divisor=32),
           dict(type='DefaultFormatBundle'),
           dict(type='Collect', keys=['img']),
       ])
   ]

Train predefined models on standard datasets

MMDetection also provides out-of-the-box tools for training detection models. This section shows how to train a predefined model (under configs) on a standard dataset (i.e. COCO).

Important: the default learning rate in the config files is for 8 GPUs and 2 img/gpu (batch size = 8 * 2 = 16). According to the linear scaling rule, you need to set the learning rate proportionally to the batch size if you use different GPUs or a different number of images per GPU, e.g. lr=0.01 for 4 GPUs * 2 imgs/gpu and lr=0.08 for 16 GPUs * 4 imgs/gpu.
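
In other words, the learning rate scales as base_lr * batch_size / 16, where base_lr = 0.02 is the default for the 8-GPU, 2-img/gpu setting. A quick sanity check (a sketch, assuming the base_lr of the Faster R-CNN config used above):

# a sketch: linear scaling rule for the learning rate
# base_lr corresponds to the default 8 GPUs * 2 imgs/gpu (batch size 16)
def scaled_lr(num_gpus, imgs_per_gpu, base_lr=0.02, base_batch=16):
    return base_lr * (num_gpus * imgs_per_gpu) / base_batch

print(scaled_lr(4, 2))   # 0.01, matching the example above
print(scaled_lr(16, 4))  # 0.08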

Prepare dataset

Training also requires preparing the datasets. See Prepare dataset above for details.

Note: currently, the config files under configs/cityscapes use COCO-pretrained weights for initialization. If your network connection is unavailable or slow, you may download the pretrained models in advance; otherwise, errors may occur at the beginning of training.

Train on a single GPU

We provide tools/train.py to launch training on a single GPU. The basic usage is as follows.

python tools/train.py \
    ${CONFIG_FILE} \
    [optional arguments]

During training, log files and checkpoints will be saved to the working directory, which is specified by work_dir in the config file or via the --work-dir CLI argument.

By default, the model is evaluated on the validation set after every epoch; the evaluation interval can be specified in the config file, as shown below.

# evaluate the model every 12 epochs.
evaluation = dict(interval=12)
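
For COCO-style configs, the same dict usually also names the metrics to report. A sketch (assuming a dataset whose evaluate() accepts the standard COCO metrics):

# a sketch: evaluate every 12 epochs and report bbox and mask AP
evaluation = dict(interval=12, metric=['bbox', 'segm'])
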
  • The tool accepts several optional arguments, including:
    • --no-validate: disables evaluation during training.
    • --work-dir ${WORK_DIR}: overrides the working directory.
    • --resume-from ${CHECKPOINT_FILE}: resumes training from a previous checkpoint file.
    • --options 'Key=value': overrides other settings in the config used.

Note:

The difference between resume-from and load-from:

resume-from loads both the model weights and the optimizer state, and the epoch count is also inherited from the specified checkpoint. It is usually used to resume a training process that was interrupted accidentally. load-from only loads the model weights, and training starts from epoch 0; it is usually used for fine-tuning.
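
Both behaviours are driven by config fields of the same names. A sketch with hypothetical paths (set one of the two, not both):

# a sketch: config fields controlling checkpoint loading (hypothetical paths)
resume_from = 'work_dirs/faster_rcnn_r50_fpn_1x_coco/latest.pth'  # weights + optimizer + epoch
load_from = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'  # weights only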

Training on multiple GPUs

We provide tools/dist_train.sh to launch training on multiple GPUs. The basic usage is as follows.

bash ./tools/dist_train.sh \
    ${CONFIG_FILE} \
    ${GPU_NUM} \
    [optional arguments]

The optional arguments remain the same as stated above.

Start multiple jobs at the same time

If you want to launch multiple jobs on a single machine, e.g. two 4-GPU training jobs on a machine with 8 GPUs, you need to specify a different port (29500 by default) for each job to avoid communication conflicts.

If you use dist_train.sh to launch training jobs, you can set the port in the command.

CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4

Training on multiple nodes

MMDetection relies on the torch.distributed package for distributed training. Thus, as a basic usage, you can launch distributed training via PyTorch's launch utility.

Manage jobs with Slurm

Slurm is a good job scheduling system for computing clusters. On a cluster managed by Slurm, you can use slurm_train.sh to spawn training jobs. It supports both single-node and multi-node training.

The basic usage is as follows.

[GPUS=${GPUS}] ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}

The following is an example of using 16 GPUs to train Mask R-CNN on a Slurm partition named dev, with the working directory set to some shared file system.

GPUS=16 ./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x

You can check the source code to review the full arguments and environment variables.

When using Slurm, the port option needs to be set in one of the following ways:

  1. Set the port through --options. This is recommended since it does not change the original configs.
CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR} --options 'dist_params.port=29500'
CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR} --options 'dist_params.port=29501'

  2. Modify the config files to set different communication ports.

In config1.py, set

dist_params = dict(backend='nccl', port=29500)

In config2.py, set

dist_params = dict(backend='nccl', port=29501)

Then you can launch two jobs with config1.py and config2.py.

CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}
