How Go implements protobuf encoding and decoding (1): principles

Link to the original text: https://mp.weixin.qq.com/s/O8...

This article belongs to a two-part series analyzing how Go implements protobuf encoding and decoding:

  1. How Go implements protobuf encoding and decoding (1): principles
  2. How Go implements protobuf encoding and decoding (2): source code

This is the first part.

Introduction to Protocol Buffers

Protocol buffers, abbreviated as protobuf, is a mechanism created by Google for serializing structured data (GitHub repository: https://github.com/protocolbu...).

Protobuf is mainly used in RPC scenarios where programs written in different programming languages cooperate, to define the data formats that need to be serialized. A proto definition is essentially just a structural description of the data being exchanged, and in that respect it is not fundamentally different from other interchange formats such as XML or JSON. The definition only describes the data; it does not itself encode or decode anything.

Its official introduction is as follows:

Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages.

Protocol buffers multilingual support

Protobuf supports a variety of programming languages: data of the types defined in protobuf can be converted to and from the native types of each language. The corresponding types for each language are listed in the official documentation.

Now for how protobuf supports multiple languages. Protobuf ships a program called protoc, a compiler that compiles proto files into source files for a target language. It supports C++, C#, Java, and Python out of the box. For Go and Dart, a plug-in must be installed to generate the language files.

For C++, protoc can compile a.proto into a.pb.h and a.pb.cc.

For Go, protoc needs to use the plug-in protoc-gen-go to compile a.proto into a.pb.go, which contains defined data types, its serialization and deserialization functions, etc.
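The plug-in lookup follows a naming convention: for an output flag --NAME_out that protoc does not handle itself, it looks for an executable named protoc-gen-NAME on PATH. A minimal sketch of that convention follows. pluginFor and the builtin table are hypothetical helpers written for illustration (they are not part of protoc), and the table only lists the four built-in languages named above.

```go
package main

import (
	"fmt"
	"strings"
)

// builtin holds the output-flag names of the languages this article lists as
// supported by protoc itself. Illustrative only, not protoc's real table.
var builtin = map[string]bool{"cpp": true, "csharp": true, "java": true, "python": true}

// pluginFor returns the executable name protoc would look for on PATH for an
// output flag such as "--go_out=.". It returns "" when the argument is not an
// output flag or when the language is in the builtin table.
func pluginFor(flag string) string {
	// Drop the "=value" part, if any.
	if i := strings.IndexByte(flag, '='); i >= 0 {
		flag = flag[:i]
	}
	name := strings.TrimPrefix(flag, "--")
	if name == flag || !strings.HasSuffix(name, "_out") {
		return ""
	}
	lang := strings.TrimSuffix(name, "_out")
	if builtin[lang] {
		return ""
	}
	return "protoc-gen-" + lang
}

func main() {
	for _, f := range []string{"--go_out=.", "--dart_out=.", "--cpp_out=."} {
		fmt.Printf("%-14s -> %q\n", f, pluginFor(f))
	}
}
```

This is why installing protoc-gen-go into a directory on PATH is enough for --go_out to work: no extra configuration links the two programs.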

Key point: for Go, protoc, through protoc-gen-go, is only responsible for compiling proto files into Go source files; it is not responsible for serialization or deserialization. The serialization and deserialization functions in the generated Go files are just wrappers.

So who is responsible for protobuf serialization and deserialization in Go?

The package github.com/golang/protobuf/proto. It serializes Go structs into proto data ([]byte) and deserializes proto data back into Go structs.

That covers the principles. Next, a simple example shows how to use protoc and protoc-gen-go, and how to serialize and deserialize.

A Hello World sample

According to the introduction above, we need to install two tools to use protobuf in Go: protoc and protoc-gen-go.

Install protoc and protoc-gen-go

First, go to the download page and download the protoc release that fits your system. The version used in this article looks like this:

➜  protoc-3.9.0-osx-x86_64 tree .
.
├── bin
│   └── protoc
├── include
│   └── google
│       └── protobuf
│           ├── any.proto
│           ├── api.proto
│           ├── compiler
│           │   └── plugin.proto
│           ├── descriptor.proto
│           ├── duration.proto
│           ├── empty.proto
│           ├── field_mask.proto
│           ├── source_context.proto
│           ├── struct.proto
│           ├── timestamp.proto
│           ├── type.proto
│           └── wrappers.proto
└── readme.txt

5 directories, 14 files

The installation steps of protoc are in readme.txt:

To install, simply place this binary somewhere in your PATH.

That is, add protoc-3.9.0-osx-x86_64/bin to your PATH.

If you intend to use the included well known types then don't forget to
copy the contents of the 'include' directory somewhere as well, for example
into '/usr/local/include/'.

That is, if you use the predefined (well-known) types, meaning the types in the *.proto files under the include directory shown above, copy the contents of the include directory to /usr/local/include/.

Install protoc-gen-go:

go get -u github.com/golang/protobuf/protoc-gen-go

To check the installation, make sure both programs can be found:

➜  fabric git:(release-1.4) which protoc
/usr/local/bin/protoc
➜  fabric git:(release-1.4) which protoc-gen-go
/Users/shitaibin/go/bin/protoc-gen-go

Hello world

Create a toy project that uses protoc. The project is on GitHub: golang_step_by_step.

Its directory structure is as follows:

➜  protobuf git:(master) tree helloworld1
helloworld1
├── main.go
├── request.proto
└── types
    └── request.pb.go

Define the proto file

Using proto3, define a Request message. request.proto is as follows:

// file: request.proto
syntax = "proto3";
package helloworld;
option go_package="./types";

message Request {
    string data = 1;
}
  • syntax: the protobuf version, here proto3.
  • package: not exactly equivalent to Go's package. It is better to set go_package separately to specify the package of the Go file generated from the proto file.
  • message: compiles into a Go struct.

    • string data = 1: the Request field data is of type string and its field number is 1. Every field in a message is assigned a number; encoding and decoding use that number instead of the field name, which reduces the amount of data.
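To make the role of the field number concrete, here is a minimal hand-rolled sketch of how one string field is laid out on the wire: a key built from the field number and the wire type, then the length, then the raw bytes. It uses only the standard library. fieldKey, appendUvarint and encodeString are hypothetical helper names chosen for this illustration, not functions of the proto package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Wire type 2 ("length-delimited") is what protobuf uses for strings, bytes
// and embedded messages.
const wireBytes = 2

// fieldKey returns the key that precedes a field's value on the wire:
// (field number << 3) | wire type. The field name never appears.
func fieldKey(fieldNumber, wireType uint64) uint64 {
	return fieldNumber<<3 | wireType
}

// appendUvarint appends v to buf as a base-128 varint.
func appendUvarint(buf []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(buf, tmp[:n]...)
}

// encodeString hand-encodes one string field: key, length, then raw bytes.
func encodeString(fieldNumber uint64, s string) []byte {
	buf := appendUvarint(nil, fieldKey(fieldNumber, wireBytes))
	buf = appendUvarint(buf, uint64(len(s)))
	return append(buf, s...)
}

func main() {
	// Same content as Request{Data: "Hello LIB"} with `string data = 1`.
	fmt.Printf("% x\n", encodeString(1, "Hello LIB"))
}
```

Running it prints 0a 09 48 65 6c 6c 6f 20 4c 49 42, which is 11 bytes: one for the key, one for the length and nine for the text. The JSON text {"data":"Hello LIB"} takes 20 bytes for the same content, largely because it spells out the field name.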

Compile the proto file

$ protoc --go_out=. ./request.proto

--go_out specifies that ./request.proto is compiled into a Go source file; this generates ./types/request.pb.go. Note the two methods XXX_Unmarshal and XXX_Marshal generated for the Request struct. The contents of the file are as follows:

// file: ./types/request.pb.go
// Code generated by protoc-gen-go. DO NOT EDIT.
// source: request.proto

package types

import (
    fmt "fmt"
    math "math"

    proto "github.com/golang/protobuf/proto"
)

// Reference imports to suppress errors if they are not otherwise used.
var _ = proto.Marshal
var _ = fmt.Errorf
var _ = math.Inf

// This is a compile-time assertion to ensure that this generated file
// is compatible with the proto package it is being compiled against.
// A compilation error at this line likely means your copy of the
// proto package needs to be updated.
const _ = proto.ProtoPackageIsVersion3 // please upgrade the proto package

type Request struct {
    Data                 string   `protobuf:"bytes,1,opt,name=data,proto3" json:"data,omitempty"`
    // The XXX_ fields below are added automatically by protoc-gen-go; the protobuf runtime needs them
    XXX_NoUnkeyedLiteral struct{} `json:"-"`
    XXX_unrecognized     []byte   `json:"-"`
    XXX_sizecache        int32    `json:"-"`
}

func (m *Request) Reset()         { *m = Request{} }
func (m *Request) String() string { return proto.CompactTextString(m) }
func (*Request) ProtoMessage()    {}
func (*Request) Descriptor() ([]byte, []int) {
    return fileDescriptor_7f73548e33e655fe, []int{0}
}

// Deserialization function
func (m *Request) XXX_Unmarshal(b []byte) error {
    return xxx_messageInfo_Request.Unmarshal(m, b)
}
// Serialization function
func (m *Request) XXX_Marshal(b []byte, deterministic bool) ([]byte, error) {
    return xxx_messageInfo_Request.Marshal(b, m, deterministic)
}
func (m *Request) XXX_Merge(src proto.Message) {
    xxx_messageInfo_Request.Merge(m, src)
}
func (m *Request) XXX_Size() int {
    return xxx_messageInfo_Request.Size(m)
}
func (m *Request) XXX_DiscardUnknown() {
    xxx_messageInfo_Request.DiscardUnknown(m)
}

var xxx_messageInfo_Request proto.InternalMessageInfo

// Getter for the Data field; returns "" when called on a nil *Request
func (m *Request) GetData() string {
    if m != nil {
        return m.Data
    }
    return ""
}

func init() {
    proto.RegisterType((*Request)(nil), "helloworld.Request")
}

func init() { proto.RegisterFile("request.proto", fileDescriptor_7f73548e33e655fe) }

var fileDescriptor_7f73548e33e655fe = []byte{
    // 91 bytes of a gzipped FileDescriptorProto
    0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0xff, 0xe2, 0xe2, 0x2d, 0x4a, 0x2d, 0x2c,
    0x4d, 0x2d, 0x2e, 0xd1, 0x2b, 0x28, 0xca, 0x2f, 0xc9, 0x17, 0xe2, 0xca, 0x48, 0xcd, 0xc9, 0xc9,
    0x2f, 0xcf, 0x2f, 0xca, 0x49, 0x51, 0x92, 0xe5, 0x62, 0x0f, 0x82, 0x48, 0x0a, 0x09, 0x71, 0xb1,
    0xa4, 0x24, 0x96, 0x24, 0x4a, 0x30, 0x2a, 0x30, 0x6a, 0x70, 0x06, 0x81, 0xd9, 0x4e, 0x9c, 0x51,
    0xec, 0x7a, 0xfa, 0x25, 0x95, 0x05, 0xa9, 0xc5, 0x49, 0x6c, 0x60, 0xcd, 0xc6, 0x80, 0x00, 0x00,
    0x00, 0xff, 0xff, 0x2e, 0x52, 0x69, 0xb5, 0x4d, 0x00, 0x00, 0x00,
}

Writing the Go program

The following test program creates a Request, serializes it, and then deserializes it.

// file: main.go
package main

import (
    "fmt"

    "./types" // relative import; not supported in Go module mode
    "github.com/golang/protobuf/proto"
)

func main() {
    req := &types.Request{Data: "Hello LIB"}

    // Marshal: serialize the struct into protobuf bytes
    encoded, err := proto.Marshal(req)
    if err != nil {
        fmt.Printf("Encode to protobuf data error: %v\n", err)
        return
    }

    // Unmarshal: deserialize the bytes back into a struct
    var unmarshaledReq types.Request
    err = proto.Unmarshal(encoded, &unmarshaledReq)
    if err != nil {
        fmt.Printf("Unmarshal to struct error: %v\n", err)
        return
    }

    fmt.Printf("req: %v\n", req.String())
    fmt.Printf("unmarshaledReq: %v\n", unmarshaledReq.String())
}

Output:

➜  helloworld1 git:(master) go run main.go
req: data:"Hello LIB"
unmarshaledReq: data:"Hello LIB"
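The program prints the decoded struct but never shows the encoded bytes. Under the protobuf wire format, a Request whose only field is data = "Hello LIB" should serialize to the key 0x0a, the length 0x09 and the nine text bytes. The sketch below walks such a buffer by hand with the standard library only. The field type and decodeFields are hypothetical names for this illustration, and only wire type 2 is handled, which is all this message needs.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// field is one decoded (field number, wire type, payload) triple.
type field struct {
	Number   uint64
	WireType uint64
	Payload  []byte
}

// decodeFields walks a protobuf message and splits it into fields.
// Only wire type 2 (length-delimited) is handled in this sketch.
func decodeFields(b []byte) ([]field, error) {
	var out []field
	for len(b) > 0 {
		// Key: (field number << 3) | wire type, stored as a varint.
		key, n := binary.Uvarint(b)
		if n <= 0 {
			return nil, errors.New("bad field key")
		}
		b = b[n:]
		num, wt := key>>3, key&7
		if wt != 2 {
			return nil, fmt.Errorf("wire type %d not handled in this sketch", wt)
		}
		// Length prefix, then that many payload bytes.
		size, n := binary.Uvarint(b)
		if n <= 0 || uint64(len(b)-n) < size {
			return nil, errors.New("bad length prefix")
		}
		out = append(out, field{num, wt, b[n : n+int(size)]})
		b = b[n+int(size):]
	}
	return out, nil
}

func main() {
	encoded := []byte("\x0a\x09Hello LIB")
	fields, err := decodeFields(encoded)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, f := range fields {
		fmt.Printf("field=%d wireType=%d value=%q\n", f.Number, f.WireType, f.Payload)
	}
}
```

It prints field=1 wireType=2 value="Hello LIB". Mapping field number 1 back to the struct field Data is the part that the proto package performs using the struct tag in request.pb.go.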

Everything above is groundwork. The key point, how the proto package implements encoding and decoding, is covered in the next article. For more on using protobuf, see:

  1. Official documentation: the proto3 language guide, the encoding guide, and the Go tutorial
  2. The gRPC series of articles by Jianyu (煎鱼)

Reference Articles

  • https://tech.meituan.com/2015...
    "Serialization and Deserialization", from the Meituan technical team; worth reading.
  • https://github.com/golang/pro...
    The repository for Go's protocol buffers support; its README is worth reading in detail.
  • https://developers.google.com...
    The Go tutorial for Google Protocol Buffers; worth reading and working through in detail.
  • https://developers.google.com...
    The Google Protocol Buffers overview. It introduces what Protocol Buffers are, their principles, their history, and how they compare with XML. A must-read.
  • https://developers.google.com...
    "Language Guide (proto3)" describes the proto3 definition language, in other words the syntax of .proto files, much like a grammar of the Go language. Not sure how to write a .proto file? Reading this guide explains many of the principles and helps you avoid pitfalls. A must-read.
  • https://developers.google.com...
    "Go Generated Code" describes in detail how protoc generates .pb.go from .proto. Optional.
  • https://developers.google.com...
    "Protocol Buffers Encoding" introduces the encoding principles. Optional.
  • https://godoc.org/github.com/...
    The documentation for package proto. The proto package can be regarded as the SDK that Go uses to operate on protobuf data: it implements the conversion between structs and protobuf data, and is used together with .pb.go files.

Added by danharibo on Mon, 09 Sep 2019 16:14:10 +0300