Engineering deployment competition - examples of TIPC access and Serving support

General introduction

This project is mainly based on a GitHub project and introduces how to access TIPC and how to add Serving support. For more information about the original project, please see its README.md. In addition, you can also refer to the official TIPC and Serving documentation to connect your own model to TIPC and add Serving support.

TIPC basic chain

♣ Brief introduction

What we want to do is use one shell script to read a txt configuration file and then run train, eval, dynamic-to-static export, and inference. The goal is that by running only one shell script, users can complete one or more of the above functions.

♠ Specific process

Since we are going to run train, eval, dynamic-to-static export, and inference, we first have to write the corresponding Python scripts. Generally these are train.py, eval.py, export_model.py, and infer.py. The first two are usually written while developing the algorithm, so here I introduce the last two.

Dynamic-to-static conversion of the model

This mainly refers to converting the model from a dynamic graph to a static graph for later inference. The detailed code can be written by modifying line 531 of ppdet/engine/trainer.py:

import os

import paddle
from paddle.static import InputSpec


def export(model, model_name, output_dir):
    # Adjust the model to eval mode
    model.eval()
    # Set the address where the static graph model is saved
    save_dir = os.path.join(output_dir, model_name)
    if not os.path.exists(save_dir):
        os.makedirs(save_dir)
    # Set the input shape; use [3, -1, -1] for a dynamic shape
    image_shape = [3, -1, -1]
    # Set other inputs required
    input_spec = [{
        "image": InputSpec(
            shape=[None] + image_shape, name='image'),
        "im_shape": InputSpec(
            shape=[None, 2], name='im_shape'),
        "scale_factor": InputSpec(
            shape=[None, 2], name='scale_factor')
    }]

    static_model = paddle.jit.to_static(model, input_spec=input_spec)
    # NOTE: dy2st do not pruned program, but jit.save will prune program
    # input spec, prune input spec here and save with pruned input spec
    pruned_input_spec = _prune_input_spec(
        input_spec, 
        static_model.forward.main_program,
        static_model.forward.outputs
    )
    # Save model
    paddle.jit.save(
        static_model,
        os.path.join(save_dir, 'model'),
        input_spec=pruned_input_spec
    )


def _prune_input_spec(input_spec, program, targets):
    # try to prune static program to figure out pruned input spec
    # so we perform following operations in static mode
    paddle.enable_static()
    pruned_input_spec = [{}]
    program = program.clone()
    program = program._prune(targets=targets)
    global_block = program.global_block()

    for name, spec in input_spec[0].items():
        try:
            v = global_block.var(name)
            pruned_input_spec[0][name] = spec
        except Exception:
            pass
    paddle.disable_static()
    return pruned_input_spec
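If you want to call this export function directly rather than through tools/export_model.py, a minimal sketch under the usual ppdet conventions might look as follows (load_config and create come from ppdet.core.workspace; how you build and load your model is an assumption here and depends on your own project):

# A minimal usage sketch; building the model this way is an assumption, not this project's exact flow
from ppdet.core.workspace import load_config, create

cfg = load_config('configs/retinanet/retinanet_r50_fpn_1x_coco.yml')
model = create(cfg.architecture)                      # build the RetinaNet model from the config
state_dict = paddle.load('/home/aistudio/data/data104154/best_model.pdparams')
model.set_state_dict(state_dict)                      # load the trained weights
export(model, 'retinanet_r50_fpn_1x_coco', 'output_inference')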

After the code is written, run tools/export_model.py and the static graph model will be saved to output_inference/retinanet_r50_fpn_1x_coco (the exact location depends on your own script):

# Install required dependencies
!pip install -r requirements.txt
!python tools/export_model.py -c configs/retinanet/retinanet_r50_fpn_1x_coco.yml -o weights=/home/aistudio/data/data104154/best_model.pdparams

Inference with the model

Here you can use argparse to receive the parameters passed on the command line. For this part, as well as visualization, video inference, and more preprocessing operators, please refer to deploy/python/infer.py and deploy/python/preprocess.py.

import os
import math
from functools import reduce

import yaml
import numpy as np
import paddle
from paddle.inference import Config
from paddle.inference import create_predictor
# RetinaNet-Based-on-PPdet-main/deploy/python/utils.py
# provides the command-line parameters, a timing class and a memory-usage helper
from utils import argsparser, Timer, get_current_memory_mb
# preprocess() and the operator classes it builds (e.g. Resize, NormalizeImage,
# Permute, PadStride) are defined in deploy/python/preprocess.py
from preprocess import preprocess, Resize, NormalizeImage, Permute, PadStride


class Detector(object):
    def __init__(self,
                 pred_config,
                 model_dir,
                 device='CPU',
                 run_mode='fluid',
                 batch_size=1,
                 trt_min_shape=1,
                 trt_max_shape=1280,
                 trt_opt_shape=640,
                 trt_calib_mode=False,
                 cpu_threads=1,
                 enable_mkldnn=False):
        self.pred_config = pred_config
        # Initialize the predictor; this mainly configures CPU/GPU, MKLDNN/TensorRT and other related settings
        self.predictor, self.config = load_predictor(
            model_dir,
            run_mode=run_mode,
            batch_size=batch_size,
            min_subgraph_size=self.pred_config.min_subgraph_size,
            device=device,
            use_dynamic_shape=self.pred_config.use_dynamic_shape,
            trt_min_shape=trt_min_shape,
            trt_max_shape=trt_max_shape,
            trt_opt_shape=trt_opt_shape,
            trt_calib_mode=trt_calib_mode,
            cpu_threads=cpu_threads,
            enable_mkldnn=enable_mkldnn)
        # These record timing and memory usage statistics
        self.det_times = Timer()
        self.cpu_mem, self.gpu_mem, self.gpu_util = 0, 0, 0
    # Preprocessing function
    def preprocess(self, image_list):
        # The main purpose here is to build the preprocessing operators from the config
        # If the preprocessing is fixed, you can also refer to the following link and simply Compose the ops together
        # https://github.com/littletomatodonkey/AlexNet-Prod/blob/tipc/pipeline/Step5/AlexNet_paddle/deploy/inference_python/infer.py#L48
        preprocess_ops = []
        for op_info in self.pred_config.preprocess_infos:
            new_op_info = op_info.copy()
            op_type = new_op_info.pop('type')
            preprocess_ops.append(eval(op_type)(**new_op_info))
    
        input_im_lst = []
        input_im_info_lst = []
        for im_path in image_list:
            im, im_info = preprocess(im_path, preprocess_ops)
            input_im_lst.append(im)
            input_im_info_lst.append(im_info)
        # Convert the preprocessed input into the format required by the model (depending on your own model, this function may not be required)
        inputs = create_inputs(input_im_lst, input_im_info_lst)
        return inputs
    # Post processing
    def postprocess(self,
                    np_boxes,
                    np_masks,
                    inputs,
                    np_boxes_num,
                    threshold=0.5):
        # postprocess output of predictor
        results = {}
        results['boxes'] = np_boxes
        results['boxes_num'] = np_boxes_num
        if np_masks is not None:
            results['masks'] = np_masks
        return results

    def predict(self, image_list, threshold=0.5, warmup=0, repeats=1):
        # The preprocessing obtains the input required by the model
        self.det_times.preprocess_time_s.start()
        inputs = self.preprocess(image_list)
        self.det_times.preprocess_time_s.end()

        np_boxes, np_masks = None, None
        input_names = self.predictor.get_input_names()
        # Get the name of input and output
        for i in range(len(input_names)):
            input_tensor = self.predictor.get_input_handle(input_names[i])
            input_tensor.copy_from_cpu(inputs[input_names[i]])
        for i in range(warmup):
            self.predictor.run()
            output_names = self.predictor.get_output_names()
            boxes_tensor = self.predictor.get_output_handle(output_names[0])
            np_boxes = boxes_tensor.copy_to_cpu()
            if self.pred_config.mask:
                masks_tensor = self.predictor.get_output_handle(output_names[2])
                np_masks = masks_tensor.copy_to_cpu()

        self.det_times.inference_time_s.start()
        for i in range(repeats):
            self.predictor.run()
            output_names = self.predictor.get_output_names()
            boxes_tensor = self.predictor.get_output_handle(output_names[0])
            np_boxes = boxes_tensor.copy_to_cpu()
            boxes_num = self.predictor.get_output_handle(output_names[1])
            np_boxes_num = boxes_num.copy_to_cpu()
            if self.pred_config.mask:
                masks_tensor = self.predictor.get_output_handle(output_names[2])
                np_masks = masks_tensor.copy_to_cpu()
        self.det_times.inference_time_s.end(repeats=repeats)

        self.det_times.postprocess_time_s.start()
        results = []
        # Return results after post-processing
        if reduce(lambda x, y: x * y, np_boxes.shape) < 6:
            print('[WARNING] No object detected.')
            results = {'boxes': np.array([[]]), 'boxes_num': [0]}
        else:
            results = self.postprocess(
                np_boxes, np_masks, inputs, np_boxes_num, threshold=threshold)
        self.det_times.postprocess_time_s.end()
        self.det_times.img_num += len(image_list)
        return results

    def get_timer(self):
        return self.det_times


def load_predictor(model_dir,
                   run_mode='fluid',
                   batch_size=1,
                   device='CPU',
                   min_subgraph_size=3,
                   use_dynamic_shape=False,
                   trt_min_shape=1,
                   trt_max_shape=1280,
                   trt_opt_shape=640,
                   trt_calib_mode=False,
                   cpu_threads=1,
                   enable_mkldnn=False):
    if device != 'GPU' and run_mode != 'fluid':
        raise ValueError(
            "Predict by TensorRT mode: {}, expect device=='GPU', but device == {}"
            .format(run_mode, device))
    # Use the Paddle Inference API to load the model
    config = Config(
        os.path.join(model_dir, 'model.pdmodel'),
        os.path.join(model_dir, 'model.pdiparams')
    )
    if device == 'GPU':
        # initial GPU memory(M), device ID
        config.enable_use_gpu(200, 0)
        # optimize graph and fuse op
        config.switch_ir_optim(True)
    elif device == 'XPU':
        config.enable_xpu(10 * 1024 * 1024)
    else:
        config.disable_gpu()
        config.set_cpu_math_library_num_threads(cpu_threads)
        if enable_mkldnn:
            try:
                # cache 10 different shapes for mkldnn to avoid memory leak
                config.set_mkldnn_cache_capacity(10)
                config.enable_mkldnn()
            except Exception as e:
                print(
                    "The current environment does not support `mkldnn`, so disable mkldnn."
                )
                pass

    precision_map = {
        'trt_int8': Config.Precision.Int8,
        'trt_fp32': Config.Precision.Float32,
        'trt_fp16': Config.Precision.Half
    }
    if run_mode in precision_map.keys():
        config.enable_tensorrt_engine(
            workspace_size=1 << 10,
            max_batch_size=batch_size,
            min_subgraph_size=min_subgraph_size,
            precision_mode=precision_map[run_mode],
            use_static=False,
            use_calib_mode=trt_calib_mode)

        if use_dynamic_shape:
            min_input_shape = {
                'image': [batch_size, 3, trt_min_shape, trt_min_shape]
            }
            max_input_shape = {
                'image': [batch_size, 3, trt_max_shape, trt_max_shape]
            }
            opt_input_shape = {
                'image': [batch_size, 3, trt_opt_shape, trt_opt_shape]
            }
            config.set_trt_dynamic_shape_info(min_input_shape, max_input_shape,
                                              opt_input_shape)
            print('trt set dynamic shape done!')

    # disable print log when predict
    config.disable_glog_info()
    # enable shared memory
    config.enable_memory_optim()
    # disable feed, fetch OP, needed by zero_copy_run
    config.switch_use_feed_fetch_ops(False)
    predictor = create_predictor(config)
    return predictor, config

'''
create_inputs handles batch size 1 and larger batches separately, since larger batches
may require padding the images to a common shape. The required fields and related
information are put into a dictionary, which is then returned.
'''
def create_inputs(imgs, im_info):
    inputs = {}

    im_shape = []
    scale_factor = []
    if len(imgs) == 1:
        inputs['image'] = np.array((imgs[0], )).astype('float32')
        inputs['im_shape'] = np.array(
            (im_info[0]['im_shape'], )).astype('float32')
        inputs['scale_factor'] = np.array(
            (im_info[0]['scale_factor'], )).astype('float32')
        return inputs

    for e in im_info:
        im_shape.append(np.array((e['im_shape'], )).astype('float32'))
        scale_factor.append(np.array((e['scale_factor'], )).astype('float32'))

    inputs['im_shape'] = np.concatenate(im_shape, axis=0)
    inputs['scale_factor'] = np.concatenate(scale_factor, axis=0)

    imgs_shape = [[e.shape[1], e.shape[2]] for e in imgs]
    max_shape_h = max([e[0] for e in imgs_shape])
    max_shape_w = max([e[1] for e in imgs_shape])
    padding_imgs = []
    for img in imgs:
        im_c, im_h, im_w = img.shape[:]
        padding_im = np.zeros(
            (im_c, max_shape_h, max_shape_w), dtype=np.float32)
        padding_im[:, :im_h, :im_w] = img
        padding_imgs.append(padding_im)
    inputs['image'] = np.stack(padding_imgs, axis=0)
    return inputs
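As a reference, a minimal main function could wire these pieces together roughly as follows. PredictConfig is the helper in deploy/python/infer.py that reads infer_cfg.yml from the exported model directory; the argument names follow deploy/python/utils.py and are assumptions here, so adapt them to your own script:

# A minimal usage sketch (helper and argument names are assumptions based on deploy/python)
def main():
    args = argsparser().parse_args()
    pred_config = PredictConfig(args.model_dir)     # parses infer_cfg.yml in the model directory
    detector = Detector(
        pred_config,
        args.model_dir,
        device=args.device,
        run_mode=args.run_mode,
        batch_size=args.batch_size,
        cpu_threads=args.cpu_threads,
        enable_mkldnn=args.enable_mkldnn)
    results = detector.predict([args.image_file], threshold=0.5)
    print(results['boxes'][:5])                     # each row: [class_id, score, x1, y1, x2, y2]


if __name__ == '__main__':
    main()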

You can now load the static graph model just exported with the inference code prepared above and test it on the demo images:

!python ./deploy/python/infer.py --device=gpu --model_dir=output_inference/retinanet_r50_fpn_1x_coco --batch_size=1 --image_dir=demo

The visualized results will be saved to the output directory:

Writing the txt configuration file and the shell script

  • txt command configuration file

The shell script reads the txt file by line number, so each parameter must stay on its expected line. For example, suppose the shell script reads line 4 (counting from 0) of the txt file to decide whether to use the GPU; if you move use_gpu:True to line 6, the shell script will read the wrong value. Therefore, do not add or delete lines at will; if you must, you need to modify the corresponding shell script at the same time.
The comments below can be deleted when the file is actually used (note that the lines containing only ## must not be deleted).

===========================train_params===========================
model_name:retinanet_r50_fpn_1x_coco # Model name, also used as the name of the folder where outputs are stored
python:python3.7
gpu_list:0 # Generally set to 0; multi-GPU is not supported for now
use_gpu:True # Whether to use the GPU; if the CPU should be tested in addition to the GPU, write True|False
auto_cast:null|amp # Whether to test half precision (AMP)
epoch:lite_train_lite_infer=2|lite_train_whole_infer=1|whole_train_whole_infer=12
save_dir:tipc/train_infer_python/output/retinanet_r50 # Directory where outputs are saved
TrainReader.batch_size:lite_train_lite_infer=2|lite_train_whole_infer=2|whole_train_whole_infer=1
pretrain_weights:/home/aistudio/data/data104154/best_model.pdparams # Path to the pretrained weights; you can also use the best weights here
train_model_name:model_final.pdparams # Name of the weights file saved by training
train_infer_img_dir:./dataset/coco/test2017/ # Dataset location
filename:retinanet_r50_fpn_1x_coco
##
trainer:norm_train # Only the norm_train mode is verified here
norm_train:tools/train.py -c configs/retinanet/retinanet_r50_fpn_1x_coco.yml -o # The command line for running training; generally only the config file path after -c needs to be modified
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c configs/retinanet/retinanet_r50_fpn_1x_coco.yml -o # Here is the command line to run eval
null:null
##
===========================infer_params===========================
--output_dir:./output_infer/python/retinanet_r50 # Output directory for inference
weights:/home/aistudio/data/data104154/best_model.pdparams # Best weights loaded for export
norm_export:tools/export_model.py -c configs/retinanet/retinanet_r50_fpn_1x_coco.yml -o # Command line for static graph model export
pact_export:null
fpgm_export:null
distill_export:null
export1:null
export2:null
kl_quant_export:null
##
infer_mode:norm
infer_quant:False
inference:./deploy/python/infer.py # Location of the infer script
--device:gpu|cpu # gpu and cpu are tested
--enable_mkldnn:True|False # Whether mkldnn is used in cpu test
--cpu_threads:1|4 # Testing different threads
--batch_size:1
--use_tensorrt:null # Whether TensorRT is used in GPU testing
--run_mode:fluid # The TensorRT precision can be set here; not needed this time
--model_dir:tipc/train_infer_python/output/retinanet_r50/norm_train_gpus_0_autocast_null/retinanet_r50_fpn_1x_coco # Model directory loaded for inference
--image_dir:./dataset/coco/test2017/ # Images used for inference
--save_log_path:null
--run_benchmark:False
--trt_max_shape:1600

For inference, the other parameters configured for infer.py, such as device and enable_mkldnn, are passed into the load_predictor function shown above to initialize the predictor.

  • About the shell script file: there are actually not many places that need to be modified. You mainly need to pass in the parameters required by your own txt file and Python scripts.
    • You may need to parse the lines you added to the txt file (refer to the first two lines below) and update the line indices of the parameters parsed after them: because you added a line, the i in $(func_parser_key "${lines[i]}") for every later parameter must be increased by 1 (a Python mirror of this parsing is sketched after the snippet below).
    • Add or delete args or kwargs according to the parameters required by your Python script, as in the following: "${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_batchsize} ${set_filename} ${set_autocast}"
your_params_key=$(func_parser_key "${lines[4]}")
your_params_value=$(func_parser_value "${lines[4]}")

cmd="${python} ${run_train} ${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_batchsize} ${set_filename} ${set_autocast}"

Lightweight model verification

If the TIPC verification of the original model passes, TIPC verification of a lightweight model is very easy. Take MobileNetV1 as an example: first replace the backbone of the original model with MobileNetV1. What needs to be determined is which layers' features you need to return. You can understand the network structure by reading the source code RetinaNet-Based-on-PPdet-main/ppdet/modeling/backbones/mobilenet_v1.py; here we need to return the features of layers 4, 6, and 13, whose strides are exactly 8, 16, and 32. The pretrained weights must also be MobileNetV1 weights. The optimizer may also need to be modified, and the rest can stay the same:

_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '_base_/retinanet_r50_fpn.yml',
  '_base_/optimizer_1x.yml',
  '_base_/retinanet_reader.yml',
]

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV1_pretrained.pdparams

RetinaNet:
  backbone: MobileNet

MobileNet:
  scale: 1
  feature_maps: [4, 6, 13]
  with_extra_blocks: false
  extra_block_filters: []

Then, after training to obtain the best weights, modify the relevant parts of the original txt file to the MobileNetV1 versions (for example, replace the path of the best weights with the MobileNetV1 ones). The two files to compare are:
tipc/train_infer_python/configs/retinanet/retinanet_mobilenet_v1_fpn_1x_coco.txt
tipc/train_infer_python/configs/retinanet/retinanet_r50_fpn_1x_coco.txt

♦ Trying it out and viewing the results

For more details, please refer to the file tipc/train_infer_python/README.md

  • Installing various dependencies and preparing datasets
!pip install -r requirements.txt
!bash tipc/train_infer_python/prepare.sh tipc/train_infer_python/configs/retinanet/retinanet_r50_fpn_1x_coco.txt 'lite_train_lite_infer'
  • Train and run inference for retinanet_r50_fpn_1x_coco with a small amount of data
!bash tipc/train_infer_python/test_train_inference_python.sh tipc/train_infer_python/configs/retinanet/retinanet_r50_fpn_1x_coco.txt 'lite_train_lite_infer'

If the operation is successful, it will output as follows:

Run successfully with command - python3.7 tools/train.py -c configs/retinanet/retinanet_r50_fpn_1x_coco.yml -o use_gpu=True save_dir=tipc/train_infer_python/output/retinanet_r50/norm_train_gpus_0_autocast_null epoch=2 pretrain_weights=/home/aistudio/data/data104154/best_model.pdparams TrainReader.batch_size=2 filename=retinanet_r50_fpn_1x_coco  ! 
Run successfully with command - python3.7 tools/eval.py -c configs/retinanet/retinanet_r50_fpn_1x_coco.yml -o weights=tipc/train_infer_python/output/retinanet_r50/norm_train_gpus_0_autocast_null/retinanet_r50_fpn_1x_coco/model_final.pdparams use_gpu=True  !  
Run successfully with command - python3.7 tools/export_model.py -c configs/retinanet/retinanet_r50_fpn_1x_coco.yml -o weights=tipc/train_infer_python/output/retinanet_r50/norm_train_gpus_0_autocast_null/retinanet_r50_fpn_1x_coco/model_final.pdparams filename=retinanet_r50_fpn_1x_coco --output_dir=./output_infer/python/retinanet_r50 !
Run successfully with command - python3.7 ./deploy/python/infer.py --device=gpu --run_mode=fluid --model_dir=./output_infer/python/retinanet_r50/retinanet_r50_fpn_1x_coco --batch_size=1 --image_dir=./dataset/coco/test2017/ --run_benchmark=False --trt_max_shape=1600 --output_dir=./output_infer/python/retinanet_r50 > tipc/train_infer_python/output/retinanet_r50/python_infer_gpu_precision_fluid_batchsize_1.log 2>&1 !
  • Train and run inference for retinanet_mobilenet_v1_fpn_1x_coco with a small amount of data
!bash tipc/train_infer_python/test_train_inference_python.sh tipc/train_infer_python/configs/retinanet/retinanet_mobilenet_v1_fpn_1x_coco.txt 'lite_train_lite_infer'

If it runs successfully, similar information will be output, and the outputs and log files will be saved to tipc/train_infer_python/output. The visualized output is as follows:

Serving support

♣ Brief introduction

What we need to do for service deployment is load the model on the server side and then access it from the client side to run inference. The model loaded here is converted from the static graph model, so you need to export the static graph model as before.

♠ Specific process

If we have already done TIPC, we already have the static graph model. After converting the static graph model into the required serving model, the main work is the server side, the client side, and some pre- and post-processing, and finally connecting it all to TIPC, which again means writing a shell script and a corresponding txt file as before.

Environment preparation

First, install some packages required for serving, check your environment, and select the corresponding version for installation:

!nvidia-smi
Tue Jan 25 09:56:06 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:05:00.0 Off |                    0 |
| N/A   34C    P0    40W / 300W |      0MiB / 32480MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
# Download package
!wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
!wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.7.0-cp37-none-any.whl
!wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.7.0-py3-none-any.whl
# Installation package
!pip install paddle_serving_server_gpu-0.7.0.post101-py3-none-any.whl
!pip install paddle_serving_client-0.7.0-cp37-none-any.whl
!pip install paddle_serving_app-0.7.0-py3-none-any.whl

Use the following command to convert the static graph model into the model required by Serving:

'''
--dirname          Directory where the static graph model is saved
--model_filename   Static graph model file name
--params_filename  Static graph parameters file name
--serving_server   Directory to save the server-side model and config (deploy/serving/serving_server)
--serving_client   Directory to save the client-side config (deploy/serving/serving_client)
'''
!python3 -m paddle_serving_client.convert --dirname output_inference/retinanet_r50_fpn_1x_coco --model_filename model.pdmodel --params_filename model.pdiparams --serving_server deploy/serving/serving_server --serving_client deploy/serving/serving_client          

The output model can be found in the directory deploy/serving.

Server and its configuration file

On the server side you mainly need to write the input preprocessing and the result post-processing. You can refer to the following code:

import base64

import cv2
import numpy as np

from paddle_serving_server.web_service import WebService, Op
# the preprocessing operators used below are defined in deploy/serving/preprocess_ops.py
from preprocess_ops import Compose, Resize, NormalizeImage, Permute, PadStride


class RetinaNetOp(Op):
    def init_op(self):
        # Here are some preprocessing operations required for compose
        self.eval_transforms = Compose([
            Resize(target_size=[800, 1333]), 
            NormalizeImage(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224,0.225]), 
            Permute(),
            PadStride(32)
        ])

    def preprocess(self, input_dicts, data_id, log_id):
        # In fact, this is mainly for the case where bs is 1
        (_, input_dict), = input_dicts.items()
        batch_size = len(input_dict.keys())
        imgs = []
        imgs_info = {'im_shape':[], 'scale_factor':[]}
        for key in input_dict.keys():
            # Decode incoming data
            data = base64.b64decode(input_dict[key].encode('utf8'))
            img = cv2.imdecode(np.frombuffer(data, np.uint8), cv2.IMREAD_COLOR)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            # Add fields required by the model
            im_info = {
                'scale_factor': np.array([1., 1.], dtype=np.float32),
                'im_shape': img.shape[:2],
            }
            # Data enhancement and preprocessing
            img, im_info = self.eval_transforms(img, im_info)
            imgs.append(img[np.newaxis, :].copy())
            imgs_info["im_shape"].append(im_info["im_shape"][np.newaxis, :].copy())
            imgs_info["scale_factor"].append(im_info["scale_factor"][np.newaxis, :].copy())
            
        input_imgs = np.concatenate(imgs, axis=0)
        input_im_shape = np.concatenate(imgs_info["im_shape"], axis=0)
        input_scale_factor = np.concatenate(imgs_info["scale_factor"], axis=0)
        # Only the first returned value matters here; it holds the inputs required for model inference
        return {"image": input_imgs, "im_shape": input_im_shape, "scale_factor": input_scale_factor}, False, None, ""

    def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
        # The output (fetch) name can be found in deploy/serving/serving_server/serving_server_conf.prototxt, which is generated when converting the static graph model
        np_boxes = list(fetch_dict.values())[0]
        # This is the output after NMS; each row is [class_id, score, x1, y1, x2, y2], so post-process accordingly
        keep = (np_boxes[:, 1] > 0.5) & (np_boxes[:, 0] > -1)
        np_boxes = np_boxes[keep, :]
        result = {"class_id": [], "confidence": [], "left_top": [], "right_bottom": []}
        for dt in np_boxes:
            clsid, bbox, score = int(dt[0]), dt[2:], dt[1]

            xmin, ymin, xmax, ymax = bbox

            result["class_id"].append(clsid)
            result["confidence"].append(score)
            result["left_top"].append([xmin, ymin])
            result["right_bottom"].append([xmax, ymax])

        result["class_id"] = str(result["class_id"])
        result["confidence"] = str(result["confidence"])
        result["left_top"] = str(result["left_top"])
        result["right_bottom"] = str(result["right_bottom"])

        return result, None, ""

# The rest are fixed operations
class RetinaNetService(WebService):
    def get_pipeline_response(self, read_op):
        retinanet_op = RetinaNetOp(name="retinanet", input_ops=[read_op])
        return retinanet_op

# define the service class
uci_service = RetinaNetService(name="retinanet")
# load config and prepare the service
uci_service.prepare_pipeline_config("config.yml")
# start the service
uci_service.run_service()

As for the server-side configuration file, there is not much that needs to be written:

op:
    # op name; it must match the name parameter used when initializing the WebService class in web_service.py
    retinanet:
        # Concurrency; when is_thread_op is true this means thread concurrency, otherwise process concurrency
        concurrency: 1

        # When the op configuration has no server_endpoints, the local service configuration is read from local_service_conf
        local_service_conf:
            # Location of the exported serving model
            model_config: "./serving_server"

For details, please refer to deploy/serving/web_service.py and deploy/serving/config.yml.
For the preprocessing operators, please refer to deploy/serving/preprocess_ops.py; you can add operators according to your needs (see the sketch below).
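As a reference for adding your own operator: the interface used above is simple, each operator takes (img, im_info) and returns the updated pair, and Compose just chains them. A minimal sketch assuming that interface (not the project's exact implementation) looks like this:

# Minimal sketch of the operator interface assumed by the server-side preprocessing
import numpy as np


class Compose:
    # chain preprocessing operators; each op takes (img, im_info) and returns the updated pair
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img, im_info):
        for t in self.transforms:
            img, im_info = t(img, im_info)
        return img, im_info


class PadStride:
    # pad the CHW image so that H and W are multiples of `stride` (needed by the FPN)
    def __init__(self, stride=32):
        self.stride = stride

    def __call__(self, img, im_info):
        im_c, im_h, im_w = img.shape
        pad_h = int(np.ceil(im_h / self.stride) * self.stride)
        pad_w = int(np.ceil(im_w / self.stride) * self.stride)
        padded = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
        padded[:, :im_h, :im_w] = img
        return padded, im_info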

Here you need to open a terminal and run python3 deploy/serving/web_service.py &. Note that if you have run it before, you need to kill the old process first, as shown in the figure below. In addition, all paths at this point need to be absolute paths; later, when connecting to TIPC, relative paths are used.

Writing the client

The client is simpler. It mainly specifies which model service to access and which image to test. The details are as follows:

import base64
import json

import requests


def get_args(add_help=True):
    import argparse
    parser = argparse.ArgumentParser(
        description='Paddle Serving', add_help=add_help)
    # Pictures to be tested
    parser.add_argument('--img_path', default="dataset/coco/test2017/000000575930.jpg")
    args = parser.parse_args()
    return args

# Encode input
def cv2_to_base64(image):
    return base64.b64encode(image).decode('utf8')


def main(args):
    # Visit the url of your model and change it to your model name
    url = "http://127.0.0.1:18080/retinanet/prediction"
    logid = 10000

    img_path = args.img_path
    with open(img_path, 'rb') as file:
        image_data1 = file.read()
    # data should be transformed to the base64 format
    image = cv2_to_base64(image_data1)
    data = {"key": ["image"], "value": [image], "logid": logid}
    # send requests
    r = requests.post(url=url, data=json.dumps(data))
    print(r.json())


if __name__ == "__main__":
    args = get_args()
    main(args)

Similarly, to test the client you just wrote: after starting the server as described above, press Ctrl+C to get the prompt back, and then run the client from the terminal: python3 deploy/serving/pipeline_http_client.py

Access TIPC

Connecting Serving to TIPC is not as complex as before, and few things need to be changed; the points to pay attention to are the same as before. The txt file is written as follows:

===========================serving_params===========================
model_name:RetinaNet
python:python3.7
trans_model:-m paddle_serving_client.convert
--dirname:output_infer/python/retinanet_r50/retinanet_r50_fpn_1x_coco # The location of the static graph model, which is used to export the serving model
--model_filename:model.pdmodel
--params_filename:model.pdiparams
--serving_server:deploy/serving/serving_server # server side model export location
--serving_client:deploy/serving/serving_client # Location of client-side model export
serving_dir:./deploy/serving
web_service:web_service.py
op.alexnet.local_service_conf.devices:0
null:null
null:null
null:null
null:null
pipline:pipeline_http_client.py
--img_path:../../dataset/coco/test2017/000000575930.jpg # Pictures to be tested

♦ Trying it out and viewing the results

For a more detailed introduction and environment configuration, please refer to tipc/serving/README.md. First install the dependencies:

!pip install -r requirements.txt

Test

!bash tipc/serving/test_serving.sh tipc/serving/configs/retinanet_r50_fpn_1x_coco.txt
################### run test ###################
/home/aistudio/deploy/serving

2022/01/25 11:23:14 start proxy service
W0125 11:23:18.311997  6867 analysis_predictor.cc:795] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect.
I0125 11:23:18.441354  6867 analysis_predictor.cc:665] ir_optim is turned off, no IR pass will be executed
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I0125 11:23:18.746263  6867 ir_params_sync_among_devices_pass.cc:45] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I0125 11:23:18.907073  6867 memory_optimize_pass.cc:216] Cluster name : reshape2_35.tmp_1  size: 0
I0125 11:23:18.907119  6867 memory_optimize_pass.cc:216] Cluster name : fill_constant_43.tmp_0  size: 4
I0125 11:23:18.907122  6867 memory_optimize_pass.cc:216] Cluster name : fill_constant_41.tmp_0  size: 4
I0125 11:23:18.907131  6867 memory_optimize_pass.cc:216] Cluster name : im_shape  size: 8
I0125 11:23:18.907136  6867 memory_optimize_pass.cc:216] Cluster name : scale_factor  size: 8
I0125 11:23:18.907138  6867 memory_optimize_pass.cc:216] Cluster name : image  size: 12
I0125 11:23:18.907143  6867 memory_optimize_pass.cc:216] Cluster name : conv2d_181.tmp_1  size: 144
I0125 11:23:18.907147  6867 memory_optimize_pass.cc:216] Cluster name : batch_norm_52.tmp_3  size: 8192
I0125 11:23:18.907155  6867 memory_optimize_pass.cc:216] Cluster name : relu_39.tmp_0  size: 4096
I0125 11:23:18.907160  6867 memory_optimize_pass.cc:216] Cluster name : conv2d_123.tmp_0  size: 8192
I0125 11:23:18.907163  6867 memory_optimize_pass.cc:216] Cluster name : batch_norm_49.tmp_1  size: 8192
I0125 11:23:18.907166  6867 memory_optimize_pass.cc:216] Cluster name : conv2d_161.tmp_1  size: 144
I0125 11:23:18.907169  6867 memory_optimize_pass.cc:216] Cluster name : conv2d_171.tmp_1  size: 144
I0125 11:23:18.907172  6867 memory_optimize_pass.cc:216] Cluster name : relu_45.tmp_0  size: 8192
I0125 11:23:18.907176  6867 memory_optimize_pass.cc:216] Cluster name : elementwise_add_15  size: 8192
I0125 11:23:18.907179  6867 memory_optimize_pass.cc:216] Cluster name : reshape2_28.tmp_0  size: 320
I0125 11:23:18.907183  6867 memory_optimize_pass.cc:216] Cluster name : conv2d_141.tmp_1  size: 144
I0125 11:23:18.907186  6867 memory_optimize_pass.cc:216] Cluster name : relu_21.tmp_0  size: 2048
I0125 11:23:18.907189  6867 memory_optimize_pass.cc:216] Cluster name : relu_88.tmp_0  size: 1024
I0125 11:23:18.907197  6867 memory_optimize_pass.cc:216] Cluster name : conv2d_151.tmp_1  size: 144
--- Running analysis [ir_graph_to_program_pass]
I0125 11:23:19.583788  6867 analysis_predictor.cc:714] ======= optimize end =======
I0125 11:23:19.620709  6867 naive_executor.cc:98] ---  skip [feed], feed -> scale_factor
I0125 11:23:19.620766  6867 naive_executor.cc:98] ---  skip [feed], feed -> image
I0125 11:23:19.620771  6867 naive_executor.cc:98] ---  skip [feed], feed -> im_shape
I0125 11:23:19.632345  6867 naive_executor.cc:98] ---  skip [_generated_var_22], fetch -> fetch
I0125 11:23:19.632387  6867 naive_executor.cc:98] ---  skip [_generated_var_23], fetch -> fetch
W0125 11:23:19.708725  6867 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0125 11:23:19.712651  6867 device_context.cc:465] device: 0, cuDNN Version: 7.6.
{'err_no': 0, 'err_msg': '', 'key': ['class_id', 'confidence', 'left_top', 'right_bottom'], 'value': ['[0, 27, 39, 39, 39, 39, 48, 48]', '[0.9298271, 0.78884697, 0.609955, 0.56487834, 0.56370527, 0.5328276, 0.6830632, 0.67401433]', '[[288.6603, 9.321735], [412.90067, 172.55153], [539.206, 3.6034787], [557.5477, 4.5205536], [521.5307, 4.789155], [571.5571, 0.0], [15.654112, 242.51068], [202.25995, 197.21396]]', '[[638.92633, 390.8219], [477.95944, 296.9499], [559.34314, 64.882324], [572.8147, 40.891556], [538.74994, 67.22812], [583.3348, 42.266556], [213.04216, 322.14337], [368.59772, 320.33978]]'], 'tensors': []}
Run successfully with command - python3.7 pipeline_http_client.py --img_path=../../dataset/coco/test2017/000000575930.jpg> ../../tipc/serving/output/server_infer_gpu_pipeline_http_usetrt_null_precision_null_batchsize_1.log 2>&1!

The specific output is saved in tipc/serving/output, and the serving output is:

{'err_no': 0, 'err_msg': '', 'key': ['class_id', 'confidence', 'left_top', 'right_bottom'], 'value': ['[0, 27, 39, 39, 39, 39, 48, 48]', '[0.9298271, 0.78884697, 0.609955, 0.56487834, 0.56370527, 0.5328276, 0.6830632, 0.67401433]', '[[288.6603, 9.321735], [412.90067, 172.55153], [539.206, 3.6034787], [557.5477, 4.5205536], [521.5307, 4.789155], [571.5571, 0.0], [15.654112, 242.51068], [202.25995, 197.21396]]', '[[638.92633, 390.8219], [477.95944, 296.9499], [559.34314, 64.882324], [572.8147, 40.891556], [538.74994, 67.22812], [583.3348, 42.266556], [213.04216, 322.14337], [368.59772, 320.33978]]'], 'tensors': []}
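To compare this with the inference output more easily, a small helper like the one below (a sketch, assuming the stringified-list response format shown above) can reformat the returned values into one line per detection:

# Sketch: turn the pipeline response shown above into per-detection lines
import ast

def print_detections(resp):
    values = dict(zip(resp['key'], resp['value']))
    class_ids = ast.literal_eval(values['class_id'])
    confidences = ast.literal_eval(values['confidence'])
    left_tops = ast.literal_eval(values['left_top'])
    right_bottoms = ast.literal_eval(values['right_bottom'])
    for cid, conf, lt, rb in zip(class_ids, confidences, left_tops, right_bottoms):
        print('class_id:{}, confidence:{:.4f}, left_top:[{:.2f},{:.2f}],'
              'right_bottom:[{:.2f},{:.2f}]'.format(cid, conf, lt[0], lt[1], rb[0], rb[1]))

# e.g. print_detections(r.json()) after the requests.post call in the client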

Compare with the inference output:

class_id:0, confidence:0.9298, left_top:[288.66,9.32],right_bottom:[638.93,390.82]
class_id:27, confidence:0.7888, left_top:[412.90,172.55],right_bottom:[477.96,296.95]
class_id:39, confidence:0.6100, left_top:[539.21,3.60],right_bottom:[559.34,64.88]
class_id:39, confidence:0.5649, left_top:[557.55,4.52],right_bottom:[572.81,40.89]
class_id:39, confidence:0.5637, left_top:[521.53,4.79],right_bottom:[538.75,67.23]
class_id:39, confidence:0.5328, left_top:[571.56,0.00],right_bottom:[583.33,42.27]
class_id:48, confidence:0.6831, left_top:[15.65,242.51],right_bottom:[213.04,322.14]
class_id:48, confidence:0.6740, left_top:[202.26,197.21],right_bottom:[368.60,320.34]

Completely consistent 🚀🚀🚀, this is the end 🌸🌸🌸
