MMDetection——2. Quick start (translation)
1: Inference and training with existing models and standard datasets
MMDetection provides hundreds of pretrained detection models in its Model Zoo and supports multiple standard datasets, including Pascal VOC, COCO, Cityscapes, LVIS, etc. This note explains how to perform common tasks with these existing models and standard datasets, including:
- Infer a given image using an existing model.
- Test existing models on standard datasets.
- Train predefined models on standard datasets.
Inference with existing models
By inference, we mean using a trained model to detect objects in images. In MMDetection, a model is defined by a configuration file, and existing model parameters are stored in a checkpoint file.
First, we recommend getting started with Faster R-CNN, using this configuration file and this checkpoint file. It is recommended to download the checkpoint file into a checkpoints/ folder under the project directory, for example as shown below.
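A minimal download step might look like the following; the URL is assumed from the Model Zoo's usual naming pattern, so verify it against the Model Zoo page before use.

mkdir -p checkpoints
# URL assumed from the Model Zoo naming pattern; verify before use
wget -P checkpoints https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth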
High-level APIs for inference
MMDetection provides high-level Python APIs for inference on images. Below is an example of building a model and running inference on a given image or video.
from mmdet.apis import init_detector, inference_detector
import mmcv

# Specify the path to model config and checkpoint file
config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'

# build the model from a config file and a checkpoint file
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# test a single image and show the results
img = 'test.jpg'  # or img = mmcv.imread(img), which will only load it once
result = inference_detector(model, img)
# visualize the results in a new window
model.show_result(img, result)
# or save the visualization results to image files
model.show_result(img, result, out_file='result.jpg')

# test a video and show the results
video = mmcv.VideoReader('video.mp4')
for frame in video:
    result = inference_detector(model, frame)
    model.show_result(frame, result, wait_time=1)
A notebook demo can be found in demo/inference_demo.ipynb.
Note: inference_detector currently only supports inference on a single image at a time; a sketch for looping over several images follows.
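If you need results for several images, a simple loop over inference_detector works. This is a minimal sketch; the image file names are placeholders, and the config/checkpoint are the Faster R-CNN files used above.

from mmdet.apis import init_detector, inference_detector

# build the detector once, then call inference_detector per image
model = init_detector(
    'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py',
    'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth',
    device='cuda:0')

image_paths = ['demo/demo.jpg', 'test1.jpg', 'test2.jpg']  # placeholder paths
for i, path in enumerate(image_paths):
    result = inference_detector(model, path)  # one image per call
    model.show_result(path, result, out_file=f'result_{i}.jpg')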
Asynchronous interface - supported for Python 3.7+
For Python 3.7+, MMDetection also supports an asynchronous interface. It allows better CPU/GPU utilization by not blocking the CPU on GPU-bound inference code. Inference can run concurrently between different input data samples or between different models of an inference pipeline.
See tests/async_benchmark.py for a comparison of the speed of the synchronous and asynchronous interfaces.
import asyncio
import torch
from mmdet.apis import init_detector, async_inference_detector
from mmdet.utils.contextmanagers import concurrent

async def main():
    config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
    checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
    device = 'cuda:0'
    model = init_detector(config_file, checkpoint=checkpoint_file, device=device)

    # queue is used for concurrent inference of multiple images
    streamqueue = asyncio.Queue()
    # queue size defines concurrency level
    streamqueue_size = 3

    for _ in range(streamqueue_size):
        streamqueue.put_nowait(torch.cuda.Stream(device=device))

    # test a single image and show the results
    img = 'test.jpg'  # or img = mmcv.imread(img), which will only load it once

    async with concurrent(streamqueue):
        result = await async_inference_detector(model, img)

    # visualize the results in a new window
    model.show_result(img, result)
    # or save the visualization results to image files
    model.show_result(img, result, out_file='result.jpg')

asyncio.run(main())
Demos
We also provide three demo scripts, implemented using the high-level APIs and supporting functional code. The source code can be found here.
Image demo
This script performs inference on a single image:
python demo/image_demo.py \
    ${IMAGE_FILE} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--score-thr ${SCORE_THR}]
Example:
python demo/image_demo.py demo/demo.jpg \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --device cpu
Webcam demo
This is a live demo with input from a webcam:
python demo/webcam_demo.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--camera-id ${CAMERA_ID}] \
    [--score-thr ${SCORE_THR}]
Example:
python demo/webcam_demo.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
Video demo
This script performs inference on a video:
python demo/video_demo.py \
    ${VIDEO_FILE} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--score-thr ${SCORE_THR}] \
    [--out ${OUT_FILE}] \
    [--show] \
    [--wait-time ${WAIT_TIME}]
Example:
python demo/video_demo.py demo/demo.mp4 \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --out result.mp4
Test existing models on standard datasets
To evaluate a model's accuracy, the model is usually tested on standard datasets. MMDetection supports several public datasets, including COCO, Pascal VOC, Cityscapes, and more. This section shows how to test an existing model on a supported dataset.
Prepare dataset
Public datasets such as Pascal VOC and COCO can be obtained from their official websites or from mirrors. Note: in detection tasks, Pascal VOC 2012 is an extension of Pascal VOC 2007 with no overlap between the two, and they are usually used together. It is recommended to download and extract the datasets somewhere outside the project directory and symlink the dataset roots to $MMDETECTION/data, as shown below. If your directory structure is different, you may need to change the corresponding paths in the config files.
mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   ├── cityscapes
│   │   ├── annotations
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── VOCdevkit
│   │   ├── VOC2007
│   │   ├── VOC2012
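If the datasets already live elsewhere on disk, symlinking them into data/ is enough to produce the layout above; the source paths below are placeholders.

# placeholder paths: point each link at your actual dataset location
mkdir -p data
ln -s /path/to/coco data/coco
ln -s /path/to/cityscapes data/cityscapes
ln -s /path/to/VOCdevkit data/VOCdevkit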
Some models, such as HTC, DetectoRS, and SCNet, additionally require the COCO-stuff dataset. Download it, unzip it, and move it into the coco folder. The directory should look like this:
mmdetection
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   │   ├── stuffthingmaps
You need to convert the Cityscapes annotations to COCO format with tools/dataset_converters/cityscapes.py:
pip install cityscapesscripts

python tools/dataset_converters/cityscapes.py \
    ./data/cityscapes \
    --nproc 8 \
    --out-dir ./data/cityscapes/annotations
Testing existing models
We provide test scripts to evaluate existing models on whole datasets (COCO, Pascal VOC, Cityscapes, etc.). The following test environments are supported:
- Single GPU
- Single node, multiple GPUs
- Multiple nodes
Choose the appropriate script for your test environment.
# single-gpu testing
python tools/test.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--out ${RESULT_FILE}] \
    [--eval ${EVAL_METRICS}] \
    [--show]

# multi-gpu testing
bash tools/dist_test.sh \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    ${GPU_NUM} \
    [--out ${RESULT_FILE}] \
    [--eval ${EVAL_METRICS}]
tools/dist_test.sh also supports multi-node testing, but it relies on PyTorch's launch utility; a rough sketch follows.
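For reference, a two-node test launched directly with PyTorch's launch utility could look roughly like the sketch below. The master address, port, and GPU counts are placeholders, and the exact launcher flags depend on your PyTorch version.

# on the first node (rank 0); master address/port are placeholders
python -m torch.distributed.launch --nnodes=2 --node_rank=0 \
    --master_addr=10.0.0.1 --master_port=29500 --nproc_per_node=8 \
    tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --launcher pytorch --eval bbox

# on the second node (rank 1)
python -m torch.distributed.launch --nnodes=2 --node_rank=1 \
    --master_addr=10.0.0.1 --master_port=29500 --nproc_per_node=8 \
    tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --launcher pytorch --eval bbox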
Optional parameters:
- RESULT_FILE: filename of the output results in pickle format. If not specified, the results will not be saved to a file.
- EVAL_METRICS: metrics to be evaluated on the results. Allowed values depend on the dataset: e.g. proposal_fast, proposal, bbox, and segm are available for COCO, while mAP and recall apply to Pascal VOC. Cityscapes can be evaluated with cityscapes as well as all COCO metrics.
- --show: if specified, detection results will be plotted on the images and shown in a new window. It is only applicable to single-GPU testing and is intended for debugging and visualization. Make sure a GUI is available in your environment; otherwise you may encounter an error like cannot connect to X server.
- --show-dir: if specified, detection results will be plotted on the images and saved to the specified directory. It is only applicable to single-GPU testing and is intended for debugging and visualization. A GUI is not required in your environment to use this option.
- --show-score-thr: if specified, detections with scores below this threshold will be removed.
- --cfg-options: if specified, the key-value pair options will be merged into the config.
- --eval-options: if specified, the key-value pairs are passed as kwargs to the dataset's evaluate() function; this applies to evaluation only.
Examples
Assume that you have already downloaded the checkpoints to the directory checkpoints/.
- Test Faster R-CNN and visualize the results. Press any key for the next image. Config and checkpoint files are available here.
python tools/test.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --show
- Test Faster R-CNN and save the painted images for later visualization. Config and checkpoint files are available here.
python tools/test.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --show-dir faster_rcnn_r50_fpn_1x_results
- Test Faster R-CNN on Pascal VOC without saving the test results, and evaluate mAP. Config and checkpoint files are available here.
python tools/test.py \
    configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py \
    checkpoints/faster_rcnn_r50_fpn_1x_voc0712_20200624-c9895d40.pth \
    --eval mAP
- Test Mask R-CNN with 8 GPUs, and evaluate the bbox and mask AP. Config and checkpoint files are available here.
./tools/dist_test.sh \
    configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
    8 \
    --out results.pkl \
    --eval bbox segm
- Test Mask R-CNN with 8 GPUs, and evaluate the class-wise bbox and mask AP. Config and checkpoint files are available here.
./tools/dist_test.sh \
    configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
    8 \
    --out results.pkl \
    --eval bbox segm \
    --options "classwise=True"
- Test Mask R-CNN on COCO test-dev with 8 GPUs, and generate JSON files for submission to the official evaluation server. Config and checkpoint files are available here.
./tools/dist_test.sh \
    configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
    8 \
    --format-only \
    --options "jsonfile_prefix=./mask_rcnn_test-dev_results"
This command generates two JSON files, mask_rcnn_test-dev_results.bbox.json and mask_rcnn_test-dev_results.segm.json.
- Test Mask R-CNN on Cityscapes with 8 GPUs, generate txt and png files, and submit them to the official evaluation server. Config and checkpoint files are available here.
./tools/dist_test.sh \
    configs/cityscapes/mask_rcnn_r50_fpn_1x_cityscapes.py \
    checkpoints/mask_rcnn_r50_fpn_1x_cityscapes_20200227-afe51d5a.pth \
    8 \
    --format-only \
    --options "txtfile_prefix=./mask_rcnn_cityscapes_test_results"
The generated png and txt files will be under the ./mask_rcnn_cityscapes_test_results directory.
Test without ground-truth annotations
MMDetection supports testing models without ground-truth annotations by using CocoDataset. If your dataset is not in COCO format, convert it to COCO format first. For example, if your dataset is in VOC format, you can convert it directly to COCO format with the converter script in tools (tools/dataset_converters/pascal_voc.py); a hypothetical invocation is sketched below.
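A minimal sketch of such a conversion; the flag names here are assumptions, so check the script's --help for the exact interface.

# flag names are assumptions; verify with:
#   python tools/dataset_converters/pascal_voc.py --help
python tools/dataset_converters/pascal_voc.py \
    ./data/VOCdevkit \
    --out-dir ./data/voc_coco_format \
    --out-format coco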
# single-gpu testing
python tools/test.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    --format-only \
    --options ${JSONFILE_PREFIX} \
    [--show]

# multi-gpu testing
bash tools/dist_test.sh \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    ${GPU_NUM} \
    --format-only \
    --options ${JSONFILE_PREFIX} \
    [--show]
Assuming the checkpoints from the model zoo have been downloaded to the directory checkpoints/, we can test Mask R-CNN on COCO test-dev with 8 GPUs and generate JSON files with the following command.
./tools/dist_test.sh \
    configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
    checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
    8 \
    --format-only \
    --options "jsonfile_prefix=./mask_rcnn_test-dev_results"
This command generates two JSON files, mask_rcnn_test-dev_results.bbox.json and mask_rcnn_test-dev_results.segm.json.
Batch inference
MMDetection supports inference with either a single image or batched images in test mode. By default, single-image inference is used. To use batch inference, modify samples_per_gpu in the test data config, as follows.
data = dict(train=dict(...), val=dict(...), test=dict(samples_per_gpu=2, ...))
Alternatively, you can set it through --cfg-options as data.test.samples_per_gpu=2, for example as shown below.
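For example, a single-GPU test run that enables batch inference on the fly might look like the following, reusing the Faster R-CNN config and checkpoint from earlier:

python tools/test.py \
    configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    --eval bbox \
    --cfg-options data.test.samples_per_gpu=2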
Deprecated ImageToTensor
In test mode, the ImageToTensor pipeline is deprecated; DefaultFormatBundle is recommended instead. It is recommended to manually replace it in the test data pipeline of your config file. For example:
# use ImageToTensor (deprecated)
pipelines = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', mean=[0, 0, 0], std=[1, 1, 1]),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

# manually replace ImageToTensor with DefaultFormatBundle (recommended)
pipelines = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', mean=[0, 0, 0], std=[1, 1, 1]),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img']),
        ])
]
Training predefined models on standard datasets
MMDetection also provides out-of-the-box tools for training detection models. This section shows how to train a predefined model (under configs) on a standard dataset, i.e. COCO.
Important: the default learning rate in the config files is set for 8 GPUs and 2 imgs/GPU (batch size = 8 * 2 = 16). According to the linear scaling rule, if you use a different number of GPUs or images per GPU, you need to scale the learning rate proportionally to the batch size, e.g. lr=0.01 for 4 GPUs * 2 imgs/GPU and lr=0.08 for 16 GPUs * 4 imgs/GPU.
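As an illustration, a user config that inherits a COCO baseline could override the learning rate like this; it is a sketch of a config override, not a file shipped with MMDetection, and the _base_ path is a placeholder for whichever baseline you start from.

# a sketch of a user config, not a file shipped with MMDetection
_base_ = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
# the default lr=0.02 assumes batch size 16 (8 GPUs x 2 imgs/GPU);
# halve it for batch size 8 (4 GPUs x 2 imgs/GPU) per the linear scaling rule
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)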
Prepare dataset
Training also requires preparing the datasets. See Prepare datasets above for details.
Note: currently, the config files under configs/cityscapes initialize from COCO-pretrained weights. If your network connection is unavailable or slow, download the pretrained models in advance; otherwise, errors may occur at the beginning of training.
Train on a single GPU
We provide tools/train.py to launch training on a single GPU. The basic usage is as follows.
python tools/train.py \
    ${CONFIG_FILE} \
    [optional arguments]
During training, log files and checkpoints are saved to the working directory, which is specified by work_dir in the config file or by the --work-dir CLI argument.
By default, the model is evaluated on the validation set after each epoch; the evaluation interval can be specified in the config file, as shown below.
# evaluate the model every 12 epochs
evaluation = dict(interval=12)
This tool accepts several optional arguments, including:
- --no-validate: disable evaluation during training.
- --work-dir ${WORK_DIR}: override the working directory.
- --resume-from ${CHECKPOINT_FILE}: resume from a previous checkpoint file.
- --options 'Key=value': override other settings in the config used.
Note:
The difference between resume-from and load-from:
resume-from loads both the model weights and the optimizer state, and the epoch count is inherited from the specified checkpoint; it is typically used to resume a training process that was interrupted accidentally. load-from only loads the model weights, and training starts from epoch 0; it is typically used for fine-tuning. A short sketch of both follows.
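For instance (checkpoint paths are placeholders), resuming an interrupted run uses the CLI flag:

python tools/train.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    --resume-from work_dirs/faster_rcnn_r50_fpn_1x_coco/latest.pth

while fine-tuning typically sets load_from in the config instead:

# weights only; optimizer state is not restored and training starts from epoch 0
load_from = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'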
Train on multiple GPUs
We provide tools/dist_train.sh to launch training on multiple GPUs. The basic usage is as follows.
bash ./tools/dist_train.sh \
    ${CONFIG_FILE} \
    ${GPU_NUM} \
    [optional arguments]
The optional arguments are the same as for single-GPU training above.
Start multiple jobs at the same time
If you want to start multiple jobs on one computer, for example, two 4-GPU training jobs on a computer with 8 GPUs, you need to specify a different port for each job (29500 by default) to avoid communication conflicts.
If dist_train.sh is used to start the training job, you can set the port in the command.
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4
Training on multiple nodes
MMDetection relies on the torch.distributed package for distributed training. Thus, as a basic usage, you can launch distributed training via PyTorch's launch utility, roughly as sketched below.
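A rough two-node training sketch with the launch utility; the master address, port, and GPU counts are placeholders.

# on node 0 (placeholders for address/port)
python -m torch.distributed.launch --nnodes=2 --node_rank=0 \
    --master_addr=10.0.0.1 --master_port=29500 --nproc_per_node=8 \
    tools/train.py ${CONFIG_FILE} --launcher pytorch

# on node 1
python -m torch.distributed.launch --nnodes=2 --node_rank=1 \
    --master_addr=10.0.0.1 --master_port=29500 --nproc_per_node=8 \
    tools/train.py ${CONFIG_FILE} --launcher pytorch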
Manage jobs with Slurm
Slurm is a good job scheduling system for computing clusters. On a cluster managed by Slurm, you can use slurm_train.sh to spawn training jobs. It supports both single-node and multi-node training.
The basic usage is as follows.
[GPUS=${GPUS}] ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}
The following example uses 16 GPUs to train Mask R-CNN on a Slurm partition named dev, with the working directory set to a shared file system.
GPUS=16 ./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py /nfs/xxxx/mask_rcnn_r50_fpn_1x
You can check the source code to review the full set of arguments and environment variables.
When using Slurm, you need to set the port options in one of the following ways:
1. Set the port through --options. This is recommended since it does not change the original configs.
CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR} --options 'dist_params.port=29500'
CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR} --options 'dist_params.port=29501'
2. Modify the configuration file to set different communication ports.
In config1.py, set
dist_params = dict(backend='nccl', port=29500)
In config2.py, set
dist_params = dict(backend='nccl', port=29501)
Then you can launch two jobs using config1.py and config2.py.
CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}