You may also want to have your own AI model file format (inference deployment)

If you are new here, or if anything is unclear, please read the column "special ai model" from the beginning. In particular, readers who want to thoroughly understand where models in deep learning come from and how to build a custom model format step by step should read the first three chapters carefully, in addition to the visualization chapter.

Why did I find the time to open an "inference deployment" chapter? The main reason is that my personal ambition is currently a little big: I want to build a complete inference-deployment toolchain from scratch, that is, to work toward an inference engine. Especially since finishing the visualization chapter, I have kept feeling there is more worth doing.

Therefore, a new chapter has been opened. The number of articles in it should stay below ten, although more may be added as we go deeper. At this stage, one goal at least is to enable inference deployment of our special ai model with OpenCL. Naturally, any inference engine contains a function called front-end parsing. Different inference frameworks support different front-end models: some can accept PyTorch, TensorFlow, Darknet, Caffe, and so on. However, our main purpose is to support our own model format, so the front-end parsing workload is much smaller for us.

According to the definition of the AI model in the first article of this column, the goal of our front-end parsing is clear: parse the generated pzkm model file and extract the auxiliary information, tensor information, network layer information, and so on. This information is best organized into C++ structs or class member variables, so that the subsequent inference interface can assemble the inference network from it.

1, New structures

Previously we only wrote the interfaces for generating the model, so we did not design dedicated Tensor-related structures and generated the model file directly. Now we want the follow-up code to be decoupled from FlatBuffers, so we mirror the definitions in the fbs schema with the C++ structs below:

// The TensorType, DataType and DataLayout enums come from the
// FlatBuffers-generated header of the pzkm schema.
#include <cstdint>
#include <string>
#include <vector>

// TensorShape
struct TensorShapeS{
    uint8_t dimsize;              // number of dimensions
    std::vector<uint32_t> dims;   // size of each dimension
};
// Weights
struct WeightsS{
    uint8_t ele_bytes;            // bytes per element
    uint64_t ele_num;             // number of elements
    std::vector<uint8_t> buffer;  // raw weight bytes
};
// Tensor
struct TensorsS{
    uint32_t id;
    std::string name;
    TensorType tensor_type;       // e.g. TensorType_CONST for weights
    DataType data_type;
    DataLayout data_layout;
    struct TensorShapeS shape;
    struct WeightsS weights;      // only meaningful for CONST tensors
};
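
As a quick consistency check on these definitions, the element count implied by dims should match ele_num, and the raw buffer should hold exactly ele_num * ele_bytes bytes. The helper below is a minimal sketch of my own (CheckTensorWeights is not part of the column's code):

static bool CheckTensorWeights(const struct TensorsS& t)
{
    // the product of all dimensions gives the expected element count
    uint64_t count = 1;
    for (size_t i = 0; i < t.shape.dims.size(); i++)
        count *= t.shape.dims[i];
    return count == t.weights.ele_num &&
           t.weights.buffer.size() == t.weights.ele_num * t.weights.ele_bytes;
}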

2, Define the interface for reading model files

To avoid writing a new class and to reuse the existing code as much as possible, we add the front-end parsing interface directly to the PzkM class, as shown below:

// main class for building a pzk model
class PzkM
{
private:
    /* data */
public:
.....
    // front-end parsing interface: parses the model given its file path
    bool ReadModel(std::string modelfile);
.....
};
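
Calling it is then a one-liner. A minimal usage sketch (the file name lenet.pzkm is just an assumed example):

#include <iostream>

PzkM model;
// parse a previously generated pzkm model file
if (!model.ReadModel("lenet.pzkm"))
    std::cerr << "failed to parse model file" << std::endl;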

3, Implement the interface for reading model files

The logic of front-end parsing here is actually very simple: read back, as completely as possible, all of the metadata stored in the PModel table of the fbs schema. The parsing process is as follows:

bool PzkM::ReadModel(std::string modelfile)
{
    // 1. Clear all data of the model for initialization
    this->clear_data();
    // 2. Use the flatbuffer interface to parse the binary data of the model read out
    std::ifstream infile;
    infile.open(modelfile.c_str(), std::ios::binary | std::ios::in);
    if (!infile.is_open())
        return false;
    infile.seekg(0, std::ios::end);
    int length = infile.tellg();
    infile.seekg(0, std::ios::beg);
    char* data = new char[length];
    infile.read(data, length);
    infile.close();
    auto my_model = GetPModel(data);
    
    // 3. Read network information from the model
    // 3.1 reading auxiliary information
    this->author = std::string(my_model->author()->c_str());
    this->version = std::string(my_model->version()->c_str());
    this->model_name = std::string(my_model->model_name()->c_str());
    this->model_runtime_input_num = my_model->model_runtime_input_num();
    this->model_runtime_output_num = my_model->model_runtime_output_num();
    auto rt_input_id = my_model->model_runtime_input_id();
    for (size_t i = 0; i < rt_input_id->size(); i++)
        this->model_runtime_input_id.push_back(rt_input_id->Get(i));
    auto rt_output_id = my_model->model_runtime_output_id();
    for (size_t j = 0; j < rt_output_id->size(); j++)
        this->model_runtime_output_id.push_back(rt_output_id->Get(j));
    // 3.2 reading tensor information
    auto rt_tensor_buffer = my_model->tensor_buffer();
    for (size_t m = 0; m < rt_tensor_buffer->size(); m++){
        auto one_tensor = rt_tensor_buffer->Get(m);
        struct TensorsS tess;
        tess.shape.dims.clear();
        tess.weights.buffer.clear();
        tess.id = one_tensor->id();
        tess.name = std::string(one_tensor->name()->c_str());
        tess.tensor_type = one_tensor->tensor_type();
        tess.data_type = one_tensor->data_type();
        tess.data_layout = one_tensor->data_layout();
        tess.shape.dimsize = one_tensor->shape()->dimsize();
        for (size_t n = 0; n < tess.shape.dimsize; n++)
            tess.shape.dims.push_back(one_tensor->shape()->dims()->Get(n));
        if (tess.tensor_type == TensorType_CONST){
            tess.weights.ele_bytes = one_tensor->weights()->ele_bytes();
            tess.weights.ele_num = one_tensor->weights()->ele_num();
            for (size_t n = 0; n < one_tensor->weights()->buffer()->size(); n++)
                tess.weights.buffer.push_back(one_tensor->weights()->buffer()->Get(n));
        }
        this->rTensors.push_back(tess);
    }
    // 3.3 reading network layer information
    auto rt_layer_buffer = my_model->layer_buffer();
    for (size_t m = 0; m < rt_layer_buffer->size(); m++){
        auto one_layer = rt_layer_buffer->Get(m);
        layer_maker l = layer_maker(meta.get_meta(std::string(one_layer->type()->c_str())), one_layer->id(), std::string(one_layer->name()->c_str()));
        // 3.3.1 read network input information
        auto input_conn = one_layer->input_id();
        for (size_t n = 0; n < input_conn->size(); n++){
            auto one_conn = input_conn->Get(n);
            l.add_input(one_conn->tensor_id(), std::string(one_conn->name()->c_str()));
        }
        // 3.3.2 reading network output information
        auto output_conn = one_layer->output_id();
        for (size_t n = 0; n < output_conn->size(); n++){
            auto one_conn = output_conn->Get(n);
            l.add_output(one_conn->tensor_id(), std::string(one_conn->name()->c_str()));
        }
        // 3.3.3 reading network accessory information
        auto all_attrs = one_layer->attrs()->buffer();
        for (size_t n = 0; n < all_attrs->size(); n++){
            auto one_attr = all_attrs->Get(n);
            std::vector<uint8_t> a;
            for (size_t k = 0; k < one_attr->buffer()->size(); k++)
                a.push_back(one_attr->buffer()->Get(k));
            l.add_attr(std::string(one_attr->key()->c_str()), a);
        }
        this->rLayers.push_back(l);
    }
    delete[] data;   // matches the new[] above
    return true;
}
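
To check a round trip, we can read a model back and print a short summary. The sketch below assumes the members used above (model_name, author, version, rTensors, rLayers) are publicly accessible; the file name is again only an example:

#include <iostream>

PzkM model;
if (model.ReadModel("lenet.pzkm")) {
    std::cout << "model: " << model.model_name
              << " (author: " << model.author
              << ", version: " << model.version << ")" << std::endl;
    std::cout << "tensors: " << model.rTensors.size()
              << ", layers: " << model.rLayers.size() << std::endl;
}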

With this, front-end parsing of the model is complete. In the following articles we will officially start on the network construction of the inference runtime and the writing of the OpenCL kernels. Look forward to it!
