MMDetection official Chinese documentation 1: Inference with existing models on standard datasets
MMDetection provides hundreds of detection models in its Model Zoo and supports a variety of standard datasets, including Pascal VOC, COCO, Cityscapes, LVIS, etc. This document describes how to use these models and standard datasets to run some common tasks, including:
- Inference on given images with existing models
- Testing existing models on standard datasets
- Training predefined models on standard datasets
Inference with existing models
Inference means using a trained model to detect objects in images. In MMDetection, a model is defined by a configuration file, and its trained parameters are stored in a checkpoint file.
To begin, we suggest starting with Faster R-CNN; its configuration file and checkpoint file are available here.
We recommend downloading the checkpoint file to the checkpoints folder.
High-level API for inference
MMDetection provides a high-level Python API for running inference on images. Below is an example of building a model and running inference on images and a video.
from mmdet.apis import init_detector, inference_detector
import mmcv

# Specify the model's configuration file and checkpoint file paths
config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'

# Build the model from the configuration file and checkpoint file
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Test a single image and show the results
img = 'test.jpg'  # or img = mmcv.imread(img), so the image is read only once
result = inference_detector(model, img)
# Visualize the results in a new window
model.show_result(img, result)
# Or save the visualization results to an image file
model.show_result(img, result, out_file='result.jpg')

# Test a video and show the results
video = mmcv.VideoReader('video.mp4')
for frame in video:
    result = inference_detector(model, frame)
    model.show_result(frame, result, wait_time=1)
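The structure of result depends on the model head. A minimal sketch continuing from the snippet above (not part of the official API, and an assumption about the exact layout): for a detection-only model such as Faster R-CNN, result is typically a list with one (N, 5) array per class, each row being [x1, y1, x2, y2, score]; models with mask heads return a (bbox_results, segm_results) tuple instead.

# A minimal sketch, continuing from the example above; assumes a detection-only model.
score_thr = 0.3
for class_name, dets in zip(model.CLASSES, result):
    # dets is an (N, 5) array of [x1, y1, x2, y2, score] for this class
    keep = dets[:, 4] >= score_thr
    for x1, y1, x2, y2, score in dets[keep]:
        print(f'{class_name}: score={score:.2f}, box=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})')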
A demo on a Jupyter notebook can be found in demo/inference_demo.ipynb.
Asynchronous interface - supports Python 3.7+
For Python 3.7+, MMDetection also provides an asynchronous interface. By utilizing CUDA streams, GPU-bound inference code does not block the CPU, which allows higher CPU/GPU utilization in single-threaded applications. Inference on different data samples and with different models can run concurrently.
You can refer to tests/async_benchmark.py to compare the speed of the synchronous and asynchronous interfaces.
import asyncio
import torch
from mmdet.apis import init_detector, async_inference_detector
from mmdet.utils.contextmanagers import concurrent

async def main():
    config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
    checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
    device = 'cuda:0'
    model = init_detector(config_file, checkpoint=checkpoint_file, device=device)

    # This queue is used to run inference on multiple images in parallel
    streamqueue = asyncio.Queue()
    # The queue size defines the degree of parallelism
    streamqueue_size = 3

    for _ in range(streamqueue_size):
        streamqueue.put_nowait(torch.cuda.Stream(device=device))

    # Test a single image and show the results
    img = 'test.jpg'  # or img = mmcv.imread(img), so the image is read only once

    async with concurrent(streamqueue):
        result = await async_inference_detector(model, img)

    # Visualize the results in a new window
    model.show_result(img, result)
    # Or save the visualization results to an image file
    model.show_result(img, result, out_file='result.jpg')

asyncio.run(main())
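To actually run several samples concurrently, each inference call should borrow its own stream. A minimal sketch of how the pieces above could compose (an assumption, not official example code; meant to be placed inside main() after the streams have been queued, and the image paths are placeholders):

    # Hypothetical helper: each task borrows a CUDA stream from streamqueue
    async def detect_one(img):
        async with concurrent(streamqueue):
            return await async_inference_detector(model, img)

    image_files = ['demo/demo.jpg', 'test.jpg']  # placeholder paths
    results = await asyncio.gather(*(detect_one(img) for img in image_files))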
Demos
We also provide three demo scripts, implemented with the high-level API. The source code is available here.
Image demo
This script performs inference on a single image. You can enable --async-test for asynchronous inference.
python demo/image_demo.py \
    ${IMAGE_FILE} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--score-thr ${SCORE_THR}] \
    [--async-test]
Run example:
python demo/image_demo.py demo/demo.jpg \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --device cpu
Webcam demo
This script performs inference on the live stream from a webcam.
python demo/webcam_demo.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--camera-id ${CAMERA-ID}] \
    [--score-thr ${SCORE_THR}]
Run example:
python demo/webcam_demo.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
Video demo
This script performs inference on a video.
python demo/video_demo.py \
    ${VIDEO_FILE} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--score-thr ${SCORE_THR}] \
    [--out ${OUT_FILE}] \
    [--show] \
    [--wait-time ${WAIT_TIME}]
Run example:
python demo/video_demo.py demo/demo.mp4 \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --out result.mp4
Test existing models on standard datasets
To evaluate a model's accuracy, we usually test it on standard datasets. MMDetection supports several public datasets, including COCO, Pascal VOC, Cityscapes, and more.
This section describes how to test an existing model on a supported dataset.
Dataset preparation
Public datasets such as Pascal VOC (and its mirrors) or COCO can be obtained from their official websites or mirror sites.
Note: for the detection task, Pascal VOC 2012 is a non-overlapping extension of Pascal VOC 2007, and the two are usually used together (see the configuration sketch after the directory layout below).
We recommend downloading and extracting the datasets somewhere outside the project directory and symlinking the dataset roots to the $MMDETECTION/data folder, laid out as follows.
If your folder structure differs from the following, you need to change the corresponding paths in the configuration files.
mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   ├── cityscapes
│   │   ├── annotations
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── VOCdevkit
│   │   ├── VOC2007
│   │   ├── VOC2012
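As noted above, VOC2007 and VOC2012 are usually trained on together. A minimal sketch of how a VOC training config typically lists both trainval splits under this layout (illustrative only; the actual settings live in configs/pascal_voc/ and the _base_ dataset files):

# Sketch only: both VOC2007 and VOC2012 trainval splits feed one training set
data_root = 'data/VOCdevkit/'
data = dict(
    train=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type='VOCDataset',
            ann_file=[
                data_root + 'VOC2007/ImageSets/Main/trainval.txt',
                data_root + 'VOC2012/ImageSets/Main/trainval.txt',
            ],
            img_prefix=[data_root + 'VOC2007/', data_root + 'VOC2012/'])))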
Some models, such as HTC, DetectoRS and SCNet, require the additional COCO-stuff dataset; download and unzip it into the coco folder. The folder structure should then be as follows:
mmdetection
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   │   ├── stuffthingmaps
Panoptic segmentation models such as PanopticFPN require the additional COCO Panoptic dataset; download and unzip it into the coco/annotations folder. The folder structure should then be as follows:
mmdetection
├── data
│   ├── coco
│   │   ├── annotations
│   │   │   ├── panoptic_train2017.json
│   │   │   ├── panoptic_train2017
│   │   │   ├── panoptic_val2017.json
│   │   │   ├── panoptic_val2017
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
The annotations of the Cityscapes dataset need to be converted into COCO format. Use tools/dataset_converters/cityscapes.py to perform the conversion:
pip install cityscapesscripts

python tools/dataset_converters/cityscapes.py \
    ./data/cityscapes \
    --nproc 8 \
    --out-dir ./data/cityscapes/annotations
Testing existing models
We provide test scripts for evaluating an existing model on the whole dataset (COCO, Pascal VOC, Cityscapes, etc.). The following testing environments are supported:
- single-GPU testing
- single-node multi-GPU testing
- multi-node testing
Choose the appropriate script for your testing environment:
# Single GPU test
python tools/test.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--out ${RESULT_FILE}] \
    [--eval ${EVAL_METRICS}] \
    [--show]

# Single node, multi GPU test
bash tools/dist_test.sh \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    ${GPU_NUM} \
    [--out ${RESULT_FILE}] \
    [--eval ${EVAL_METRICS}]
tools/dist_test.sh also supports multi-node testing, but it relies on PyTorch's launch utility.
Optional parameters:
- RESULT_FILE: filename for the output results, stored in pickle (.pkl) format. If not specified, the results are not saved to a file (see the sketch after this list for inspecting such a file).
- EVAL_METRICS: metrics to evaluate. The allowed values depend on the dataset: proposal_fast, proposal, bbox and segm are available for COCO, while mAP and recall are available for Pascal VOC. Cityscapes can be evaluated with the cityscapes metric as well as all COCO metrics.
- --show: if enabled, detection results are drawn on the images and displayed in a new window. It only applies to single-GPU testing and is intended for debugging and visualization. Make sure a GUI is available in your environment; otherwise you may encounter the error cannot connect to X server.
- --show-dir: if specified, detection results are drawn on the images and saved to the given directory. It only applies to single-GPU testing and is intended for debugging and visualization. This option works even without a GUI.
- --show-score-thr: if specified, detections with scores below this threshold are removed.
- --cfg-options: if specified, the key-value pairs are merged into the configuration file.
- --eval-options: if specified, the key-value pairs are passed to the dataset's evaluate() function as keyword arguments; only used during evaluation.
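When --out is used, the saved .pkl file can be inspected with mmcv. A minimal sketch (assuming a prior run with --out results.pkl; the per-entry structure depends on the model, as described for inference results above):

import mmcv

# results.pkl holds one entry per test image; for detection-only models each
# entry is typically a per-class list of (N, 5) arrays [x1, y1, x2, y2, score].
results = mmcv.load('results.pkl')
print(len(results))      # number of test images
print(type(results[0]))  # structure of a single entry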
Examples
Assume that you have downloaded the checkpoint files to the checkpoints/ folder.
- Test Faster R-CNN and visualize the results. Press any key to continue to the next image. The configuration file and checkpoint file are available here.
python tools/test.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --show
- Test Faster R-CNN and save the rendered images for later visualization. The configuration file and checkpoint file are available here.
python tools/test.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --show-dir faster_rcnn_r50_fpn_1x_results
- Test Faster R-CNN on the Pascal VOC dataset without saving the test results, and evaluate mAP. The configuration file and checkpoint file are available here.
python tools/test.py \
    configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py \
    checkpoints/faster_rcnn_r50_fpn_1x_voc0712_20200624-c9895d40.pth \
    --eval mAP
- Test Mask R-CNN with 8 GPUs and evaluate the bbox and mask mAP. The configuration file and checkpoint file are available here.
./tools/dist_test.sh \
    configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
    8 \
    --out results.pkl \
    --eval bbox segm
- Test Mask R-CNN with 8 GPUs and evaluate the bbox and mask mAP for each class. The configuration file and checkpoint file are available here.
./tools/dist_test.sh \
    configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
    8 \
    --out results.pkl \
    --eval bbox segm \
    --options "classwise=True"
- Test Mask R-CNN on the COCO test-dev dataset with 8 GPUs and generate JSON files for submission to the official evaluation server. The configuration file and checkpoint file are available here.
./tools/dist_test.sh \
    configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
    8 \
    --format-only \
    --options "jsonfile_prefix=./mask_rcnn_test-dev_results"
This command generates two JSON files: mask_rcnn_test-dev_results.bbox.json and mask_rcnn_test-dev_results.segm.json.
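The generated files follow the standard COCO results format, so they can be inspected with plain json before submission. A minimal sketch (field names per the COCO results spec; the file name matches the command above):

import json

with open('mask_rcnn_test-dev_results.bbox.json') as f:
    dets = json.load(f)

print(len(dets), 'detections')
# Each entry is a dict like {'image_id': ..., 'category_id': ..., 'bbox': [x, y, w, h], 'score': ...}
print(dets[0])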
- Test Mask R-CNN on the Cityscapes dataset with 8 GPUs, generating txt and png files for upload to the official evaluation server. The configuration file and checkpoint file are available here.
./tools/dist_test.sh \
    configs/cityscapes/mask_rcnn_r50_fpn_1x_cityscapes.py \
    checkpoints/mask_rcnn_r50_fpn_1x_cityscapes_20200227-afe51d5a.pth \
    8 \
    --format-only \
    --options "txtfile_prefix=./mask_rcnn_cityscapes_test_results"
The generated png and txt files will be under the ./mask_rcnn_cityscapes_test_results folder.
Testing without ground truth annotations
MMDetection supports testing models without ground truth annotations, which requires the CocoDataset format. If your dataset is not in COCO format, please convert it. For example, if it is in VOC or Cityscapes format, you can convert it directly to COCO format with the scripts in tools/dataset_converters. For other formats, you can use the images2coco script (a sketch of the resulting COCO layout follows the parameter list below).
python tools/dataset_converters/images2coco.py \
    ${IMG_PATH} \
    ${CLASSES} \
    ${OUT} \
    [--exclude-extensions]
Parameters:
- IMG_PATH: root path of the images.
- CLASSES: text file listing the class names, one category per line.
- OUT: output JSON file name. By default it is saved at the same directory level as IMG_PATH.
- --exclude-extensions: file suffixes to be excluded.
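For reference, the JSON produced by the conversion follows the standard COCO layout. A minimal illustrative sketch (field names per the COCO annotation spec; the values are made up, and without ground truth the annotations list can simply be empty):

# Illustrative sketch of the converted annotation structure
coco_style = {
    'images': [
        {'id': 1, 'file_name': '0001.jpg', 'width': 1920, 'height': 1080},
    ],
    'annotations': [],  # may be empty when testing without ground truth
    'categories': [
        {'id': 0, 'name': 'person'},
        {'id': 1, 'name': 'car'},
    ],
}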
After the conversion, run the test with the following command:
# Single GPU test
python tools/test.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    --format-only \
    --options ${JSONFILE_PREFIX} \
    [--show]

# Single node, multi GPU test
bash tools/dist_test.sh \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    ${GPU_NUM} \
    --format-only \
    --options ${JSONFILE_PREFIX} \
    [--show]
Assuming the checkpoint file from the model zoo has been downloaded to the checkpoints/ folder, we can test Mask R-CNN on the COCO test-dev dataset with 8 GPUs and generate JSON files with the following command:
./tools/dist_test.sh \
    configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
    8 \
    --format-only \
    --options "jsonfile_prefix=./mask_rcnn_test-dev_results"
This command generates two JSON files: mask_rcnn_test-dev_results.bbox.json and mask_rcnn_test-dev_results.segm.json.
Batch inference
In test mode, MMDetection supports both single-image and batched inference. By default, single-image inference is used; you can set samples_per_gpu in the test data section of the configuration file to enable batch testing.
The configuration file can be modified for batch inference as follows:
data = dict(train=dict(...), val=dict(...), test=dict(samples_per_gpu=2, ...))
Alternatively, you can enable it by setting --cfg-options data.test.samples_per_gpu=2.
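If you prefer to make the change programmatically instead of editing the file or passing --cfg-options, a minimal sketch using mmcv's Config (the config path is the Faster R-CNN example used above):

from mmcv import Config

# Load the config and raise the test-time batch size before building the dataloader
cfg = Config.fromfile('configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py')
cfg.data.test.samples_per_gpu = 2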
Deprecation of ImageToTensor
In test mode, the ImageToTensor pipeline is deprecated and replaced by DefaultFormatBundle. It is recommended to replace it manually in the test data pipeline of your configuration file, for example:
# (deprecated) use ImageToTensor
pipelines = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', mean=[0, 0, 0], std=[1, 1, 1]),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

# (recommended) manually replace ImageToTensor with DefaultFormatBundle
pipelines = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', mean=[0, 0, 0], std=[1, 1, 1]),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img']),
        ])
]
Training predefined models on standard datasets
MMDetection also provides out-of-the-box tools for training detection models. This section shows how to train a predefined model on a standard dataset such as COCO.
Important: the batch size equals the number of GPUs * samples_per_gpu (set in configs/_base_/datasets/*.py). The learning rates in the configuration files assume 8 GPUs with 2 images per GPU (batch size 8 * 2 = 16).
According to the linear scaling rule, if you use a different number of GPUs or a different number of images per GPU, you need to scale the learning rate proportionally to the batch size. For example, set lr=0.01 for 4 GPUs with 2 images per GPU, and lr=0.08 for 16 GPUs with 4 images per GPU.
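As a worked example of the rule above (base learning rate 0.02 at batch size 16, as implied by the provided configs and the numbers quoted here):

# Linear scaling rule: lr scales with (num_gpus * samples_per_gpu) / 16
base_lr, base_batch_size = 0.02, 16
for num_gpus, samples_per_gpu in [(4, 2), (8, 2), (16, 4)]:
    lr = base_lr * num_gpus * samples_per_gpu / base_batch_size
    print(num_gpus, samples_per_gpu, lr)  # -> 0.01, 0.02, 0.08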
Datasets
The datasets need to be prepared before training. See Dataset preparation above for details.
Note:
Currently, the configuration files under configs/cityscapes are initialized with COCO pre-trained weights. If your network connection is unavailable or slow, you can download the pre-trained models in advance; otherwise, errors may occur at the start of training.
Training with a single GPU
We provide tools/train.py to launch training on a single GPU. The basic usage is as follows:
python tools/train.py \
    ${CONFIG_FILE} \
    [optional arguments]
During training, log files and checkpoint files are saved in the working directory, which is specified by work_dir in the configuration file or by the --work-dir CLI argument.
By default, the model is evaluated on the validation set after each epoch. The evaluation frequency can be changed in the configuration file:
# Evaluate the model every 12 epochs
evaluation = dict(interval=12)
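Besides the interval, the evaluation dict also accepts the metric(s) forwarded to the dataset's evaluate() method. A hedged sketch for a COCO-style instance-segmentation config (metric names as listed for EVAL_METRICS above):

# Evaluate bbox and mask AP on the validation set after every epoch
evaluation = dict(interval=1, metric=['bbox', 'segm'])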
This tool accepts the following parameters:
- --no-validate: disable evaluation during training
- --work-dir ${WORK_DIR}: override the working directory
- --resume-from ${CHECKPOINT_FILE}: resume training from a checkpoint file
- --options 'Key=value': override other settings in the configuration file
Note:
The difference between resume-from and load-from:
resume-from loads both the model weights and the optimizer state, and inherits the iteration count from the specified checkpoint, so training is not restarted. load-from only loads the model weights; training starts from the beginning, and it is often used for fine-tuning.
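Both behaviours can also be set directly in the configuration file instead of on the command line. A minimal sketch (the checkpoint paths are placeholders):

# Resume weights, optimizer state and epoch counter from a previous run
resume_from = 'work_dirs/faster_rcnn_r50_fpn_1x_coco/latest.pth'

# Or load weights only (training starts from scratch), e.g. for fine-tuning
# load_from = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'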
Training on multiple GPUs
We provide tools/dist_train.sh to launch training on multiple GPUs. The basic usage is as follows:
bash ./tools/dist_train.sh \
    ${CONFIG_FILE} \
    ${GPU_NUM} \
    [optional arguments]
The optional parameters are consistent with those described in the previous section.
Launching multiple jobs simultaneously
If you want to launch multiple jobs on a single machine, for example two 4-GPU training jobs on a machine with 8 GPUs, you need to specify a different port (29500 by default) for each job to avoid communication conflicts.
If you use dist_train.sh to launch the training jobs, you can set the ports with the following commands:
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4
Training on multiple nodes
MMDetection relies on the torch.distributed package for distributed training, so for basic usage we can launch it with PyTorch's launch utility.
Managing jobs with Slurm
Slurm is a widely used job scheduler for compute clusters. On a cluster managed by Slurm, you can use slurm_train.sh to launch training jobs. It supports both single-node and multi-node training.
The basic usage is as follows:
[GPUS=${GPUS}] ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}
The following example uses 16 GPUs to train Mask R-CNN on a Slurm partition named dev and sets the work-dir to a shared file system.
GPUS=16 ./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x
You can check the source code to see all the arguments and environment variables.
When using Slurm, the port must be set in one of the following ways:
- Set the port via --options. We strongly recommend this approach, since it does not require modifying the original configuration file.
CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR} --options 'dist_params.port=29500'
CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR} --options 'dist_params.port=29501'
- Modify the configuration files to use different communication ports.
In config1.py, set:
dist_params = dict(backend='nccl', port=29500)
In config2.py, set:
dist_params = dict(backend='nccl', port=29501)
Then you can launch the two jobs with config1.py and config2.py.
CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}