Go advanced learning notes

[TOC]

Go language principle

Basics

self-taught

Error

Error vs Exception

Error Type

Sentinel errors (predefined error values)

Sentinel errors become part of your public API.

Sentinel errors create a dependency between two packages.

Conclusion: avoid sentinel errors as much as possible.
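A minimal sketch of what a sentinel error looks like and why it couples packages: the defining package exports the value, and every caller that wants to check for the condition must import that package and compare against it (package and identifier names here are illustrative).

package buffer

import "errors"

// ErrBufferFull is a sentinel error value. It is part of the package's public
// API: callers detect the condition by comparing against this exact value.
var ErrBufferFull = errors.New("buffer: buffer is full")

// Caller side, in another package:
//
//     n, err := buf.Write(data)
//     if err == buffer.ErrBufferFull { // source-level dependency on package buffer
//         flush()
//     }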

Error types

type MyError struct {
    Msg string
    File string
    Line int
}

func (e *MyError) Error() string {
    return fmt.Sprintf("%s:%d: %s", e.File, e.Line, e.Msg)
}

func test() error {
    return &MyError{"Something happened", "server.go", 42}
}

func main() {
    err := test()
    switch err := err.(type) {
    case nil:
        // nothing
    case *MyError:
        fmt.Println("error occurred on line:", err.Line)
    default:
        // unknown
    }
}

The caller uses type assertions and type switches, which forces the custom error type to be public. This pattern creates strong coupling with the caller and makes the API brittle.

Conclusion: avoid error types as much as possible. Although error types are better than sentinel errors because they can capture more context about the failure, they share many of the same problems as sentinel error values.

Therefore, my advice is to avoid error types, or at least avoid making them part of your public API.

Opaque errors (opaque error handling)

import "github.com/quux/bar"

func fn() error {
    x, err := bar.Foo()
    if err != nil {
        return err
    }
    _ = x // use x ...
    return nil
}
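With opaque errors, the caller only knows whether the operation succeeded. When the caller genuinely needs to make a decision (for example, whether to retry), the usual companion pattern is to assert on behaviour rather than on a concrete type; a minimal sketch, with the temporary interface as an illustrative name:

// temporary is declared by the caller; the library never needs to export it.
type temporary interface {
    Temporary() bool
}

// IsTemporary reports whether err indicates a transient condition worth retrying.
func IsTemporary(err error) bool {
    te, ok := err.(temporary)
    return ok && te.Temporary()
}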

Handling errors

Indented flow is for errors: keep the normal, error-free path as a straight line down the page instead of burying it in indentation.

f, err := os.Open(path)
if err != nil {
    // handle error
}
// do stuff

Eliminate error handling by eliminating errors

What's wrong with the following code?

// Redundant: the explicit error check adds nothing
func AuthenticateRequest(r *Request) error {
    err := authenticate(r.User)
    if err != nil {
        return err
    }
    return nil
}

func AuthenticateRequest(r *Request) error {
    return authenticate(r.User)
}

Count the number of lines read from an io.Reader

func CountLines(r io.Reader) (int, error) {
    var (
        br = bufio.NewReader(r)
        lines int
        err error
    )

    for {
        _, err = br.ReadString('\n')
        lines++
        if err != nil {
            break
        }
    }

    if err != io.EOF {
        return 0, err
    }

    return lines, nil
}

// Improved version
func CountLines(r io.Reader) (int, error) {
    sc := bufio.NewScanner(r)
    lines := 0

    for sc.Scan() {
        lines++
    }

    return lines, sc.Err()
}

// Another example: writing an HTTP response involves many repeated error checks

type Header struct {
    Key, Value string
}

type Status struct {
    Code int
    Reason string
}

func WriteResponse(w io.Writer, st Status, headers []Header, body io.Reader) error {
    _, err := fmt.Fprintf(w, "HTTP/1.1 %d %s\r\n", st.Code, st.Reason)
    if err != nil {
        return err
    }

    for _, h := range headers {
        _, err := fmt.Fprintf(w, "%s: %s\r\n", h.Key, h.Value)
        if err != nil {
            return err
        }
    }

    if _, err := fmt.Fprint(w, "\r\n"); err != nil {
        return err
    }

    _, err = io.Copy(w, body)
    return err
}

// Improved version
type errWriter struct {
    io.Writer
    err error
}

func (e *errWriter) Write(buf []byte) (int, error) {
    if e.err != nil {
        return 0, e.err
    }

    var n int
    n, e.err = e.Writer.Write(buf)
    return n, nil
}

func WriteResponse(w io.Writer, st Status, headers []Header, body io.Reader) error {
    ew := &errWriter{Writer: w}
    fmt.Fprintf(ew, "HTTP/1.1 %d %s\r\n", st.Code, st.Reason)

    for _, h := range headers {
        fmt.Fprintf(ew, "%s: %s\r\n", h.Key, h.Value)
    }

    fmt.Fprint(ew, "\r\n")
    io.Copy(ew, body)
    return ew.err
}

Wrap errors

Go 1.13 errors

%+v (verbose formatting; prints the stack trace when using github.com/pkg/errors), %w (wrap an error with fmt.Errorf so it can later be unwrapped)
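A minimal sketch of the Go 1.13 style: wrap with %w and inspect with errors.Is; queryUser here is a hypothetical data-access call that fails with the sql.ErrNoRows sentinel.

package main

import (
    "database/sql"
    "errors"
    "fmt"
)

// queryUser stands in for a real data-access call.
func queryUser(id int) error {
    return sql.ErrNoRows
}

func findUser(id int) error {
    if err := queryUser(id); err != nil {
        // %w wraps err so callers can still unwrap it with errors.Is / errors.As.
        return fmt.Errorf("find user %d: %w", id, err)
    }
    return nil
}

func main() {
    err := findUser(1)
    fmt.Println(errors.Is(err, sql.ErrNoRows)) // true: Is matches through the wrap
}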

Go 2 Error Inspection

Timestamps

error (II)

02:30 Q & A

02:48 example

Concurrency

Goroutine

Memory model

Package sync

chan

Package context

// 3.5.go
package main

import (
    "context"
    "fmt"
    "time"
)

func main() {
    tr := NewTracker()
    go tr.Run()
    _ = tr.Event(context.Background(), "test")
    _ = tr.Event(context.Background(), "test")
    _ = tr.Event(context.Background(), "test")
    ctx, cancel := context.WithDeadline(context.Background(), time.Now().Add(2*time.Second))
    defer cancel()
    tr.Shutdown(ctx)
}

type Tracker struct {
    ch   chan string
    stop chan struct{}
}

// NewTracker creates a Tracker; the channel buffer size is illustrative.
func NewTracker() *Tracker {
    return &Tracker{
        ch:   make(chan string, 10),
        stop: make(chan struct{}),
    }
}

func (t *Tracker) Event(ctx context.Context, data string) error {
    select {
    case t.ch <- data:
        return nil
    case <-ctx.Done():
        return ctx.Err()
    }
}

func (t *Tracker) Run() {
    for data := range t.ch {
        time.Sleep(1 * time.Second)
        fmt.Println(data)
    }
    t.stop <- struct{}{}
}

func (t *Tracker) Shutdown(ctx context.Context) {
    close(t.ch)
    select {
    case <-t.stop:
    case <-ctx.Done():
    }
}

golang.org/ref/mem

www.jianshu.com/p/5e44168f47a3

Network programming

head of line blocking

goim + bullet chat

01:03 Long-lived (persistent) connection scenarios

Single queue CPU equilibrium

Endianness (big-endian vs little-endian)

TCP packet sticking (message framing)

Goim

Goim overview

Edge node

load balancing

Heartbeat keep-alive

Process keep-alive
 Heartbeat keep-alive
 Disconnect and reconnect

User authentication and Session information

Comet

Each Bucket has its own goroutine; read/write-lock optimization:
type Bucket struct {
    channels map[string]*Channel
    rooms    map[string]*Room
}

Logic

Job

Push-pull combination

Push mode: when there is a new message, the server actively pushes it to the client.
Pull mode: the client actively sends a request to pull messages.
Push-pull combination: the server pushes a real-time notification of new messages, and the client then pulls the messages.

Read fan-out vs. write fan-out

Read fan-out: in an IM system this usually means each pair of users (or each group) has one shared mailbox.
Advantages: writes (sending a message) are very light; only the conversation's mailbox is written.
 Disadvantages: reads (fetching messages) are very heavy; every mailbox the user belongs to must be read.
 Write fan-out: everyone reads only their own mailbox, but sending a message requires writing a copy into every recipient's mailbox.
Advantages: reads are light.
 Disadvantages: writes are very heavy, especially for group chat.

Unique ID design

UUID
 Snowflake-based ID generation
 Segment (step-size) generation based on an application-level DB
 Auto-increment IDs based on Redis or a DB
 Unique IDs generated by business-specific rules
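A minimal Snowflake-style sketch of the second approach above (bit widths, epoch, and names are illustrative; clock rollback is not handled):

package main

import (
    "fmt"
    "sync"
    "time"
)

// snowflakeID is a simplified Snowflake-style generator:
// 41 bits of milliseconds since a custom epoch, 10 bits of node ID, 12 bits of sequence.
type snowflakeID struct {
    mu    sync.Mutex
    epoch int64 // custom epoch, in milliseconds
    node  int64 // 0..1023
    last  int64 // last timestamp used
    seq   int64 // sequence within the same millisecond
}

func (s *snowflakeID) Next() int64 {
    s.mu.Lock()
    defer s.mu.Unlock()

    now := time.Now().UnixMilli() - s.epoch
    if now == s.last {
        s.seq = (s.seq + 1) & 0xFFF // 12-bit sequence
        if s.seq == 0 {             // sequence exhausted: wait for the next millisecond
            for now <= s.last {
                now = time.Now().UnixMilli() - s.epoch
            }
        }
    } else {
        s.seq = 0
    }
    s.last = now
    return now<<22 | s.node<<12 | s.seq
}

func main() {
    gen := &snowflakeID{epoch: 1600000000000, node: 1}
    fmt.Println(gen.Next(), gen.Next())
}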

IM private message system

Danmaku (bullet comment) system

Database design

1. Put important business on separate instances and separate databases
 Separate the danmaku database instance from other service instances; within the service, separate important and non-important business.
2. Separate index and content
 Put frequently changing data into the index table, such as status, attributes, and the danmaku pool; put static data into the content table,
such as danmaku text, mode, font size, and color.
3. Reduce the amount of data in a single table by sharding
 Shard by cid: the index table and the content table are each split into 1000 tables, bringing a single table down from roughly 70 million rows
 to about 3 million rows.
4. Store the content of special danmaku in a separate table
5. Keep the candidate set as small as possible via the top queue

Top queue design

Given that the candidate set of danmaku for a single video can exceed 2 million, we implement a Redis-based top queue to keep the candidate set as small as possible.

The Redis sorted set (zset) layout is as follows:
key:    one queue each for ordinary, protected, subtitle, special, and recommended danmaku
score:  fmt.Sprintf("%d.%d", danmaku attribute, danmaku ID)
member: danmaku ID

Each insertion returns the current queue length; danmaku beyond maxlimit have their status changed from normal to hidden.

Each time a danmaku is deleted, the most recently hidden danmaku is fetched from the database and changed from hidden back to normal.
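A rough sketch of the insertion path described above. The sortedSet interface stands in for whichever Redis client is actually used (its two methods map to ZADD and ZCARD); the key format, maxLimit, and the score encoding follow the notes but are otherwise illustrative.

import (
    "fmt"
    "strconv"
)

// sortedSet abstracts the two Redis sorted-set operations the top queue needs.
type sortedSet interface {
    ZAdd(key string, score float64, member string) error
    ZCard(key string) (int64, error)
}

const maxLimit = 2000000 // cap on candidates per video

// addDanmaku inserts one danmaku into the video's top queue and reports
// whether it should be marked hidden because the queue exceeds maxLimit.
func addDanmaku(s sortedSet, oid, attr, dmid int64) (hidden bool, err error) {
    key := fmt.Sprintf("q_%d", oid)                                       // per-video top queue
    score, _ := strconv.ParseFloat(fmt.Sprintf("%d.%d", attr, dmid), 64) // "attribute.id" encoding
    if err := s.ZAdd(key, score, strconv.FormatInt(dmid, 10)); err != nil {
        return false, err
    }
    n, err := s.ZCard(key)
    if err != nil {
        return false, err
    }
    return n > maxLimit, nil
}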

Intelligent danmaku

Smart danmaku cache (old)

L1 cache: Redis, structure: sorted set
key:
 ordinary danmaku: i_type_oid_cnt_n
 special danmaku: s_type_oid
 top danmaku queue: q_oid
score: danmaku ID
member: danmaku ID

L2 cache: memcache (mc), structure: key-value
key: danmaku ID
value: a single danmaku entry as XML

Smart danmaku cache (new)

L1 cache: memcache (mc)
 stores the danmaku for each video segment
 the cached time span of each video

L2 cache: Redis, structure: sorted set
key:
 ordinary danmaku: i_type_oid_cnt_n
 special danmaku: s_type_oid
 top danmaku queue: q_oid
score: danmaku ID
member: danmaku ID

L2 cache: Redis, structure: hash
key: c_oid
field: danmaku ID
value: a single danmaku entry as XML

memcache concurrency

The current approach is fairly crude and will be replaced with singleflight in the future.

singleflight (concurrent request suppression)
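A minimal sketch of the singleflight idea using golang.org/x/sync/singleflight: concurrent cache misses for the same key collapse into a single back-source call (loadFromDB is a hypothetical loader).

import (
    "golang.org/x/sync/singleflight"
)

var group singleflight.Group

// getArticle collapses concurrent requests for the same id into one DB call.
func getArticle(id string) (string, error) {
    v, err, _ := group.Do("article:"+id, func() (interface{}, error) {
        // Only one goroutine per key executes this; the others wait and share the result.
        return loadFromDB(id)
    })
    if err != nil {
        return "", err
    }
    return v.(string), nil
}

// loadFromDB stands in for the real cache-miss handler.
func loadFromDB(id string) (string, error) {
    return "article body for " + id, nil
}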


01:01 Live-streaming danmaku implemented with goim
 Danmaku is very similar to comments
 Douyu uses WebSocket
goim has done a lot of optimization around memory and GC

runtime

Goroutine principle

Goroutine

Differences between goroutines and threads

GMP scheduling model

G: goroutine
M: OS thread
P: processor, holds the local run queue and is bound to an M

When is an M created?
go func (when there is runnable work and no idle M to run it)

Work-stealing scheduling algorithm

Avoiding starvation

global queue -> local run queue -> try to steal from other Ps -> global queue -> network poller

Syscall

Sysmon

Network poller

When netpoll is triggered:
sysmon
schedule
start the world

Memory allocation principle

mcache -> mcentral -> mheap -> system memory (last resort)

Principle of garbage collection (GC)

Plain-language Go memory management trilogy (part III): garbage collection principles

What every programmer should know about memory, Part 1

Architecture

Engineering practice

layout

v1 github.com/bilibili/kratos-demo

model <- dao <- service <- server

In our old layout, there are api, cmd, configs and internal directories under the app directory, and README, CHANGELOG and OWNERS are generally placed in the directory.

  • api: place the API definitions (protobuf) and the corresponding generated client code, plus the swagger JSON generated from the pb.
  • configs: put the configuration files required by the service, such as database yaml,redis.yaml,application.yaml.
  • internal: prevents code from other directories in the same business from referencing internal structs such as the internal model and dao.
  • server: place the routing code of HTTP/gRPC and the code of DTO conversion.
    DTO(Data Transfer Object): data transfer object. This concept comes from the design pattern of J2EE. But here, it generally refers to the data transmission object between the presentation layer / API layer and the service layer (business logic layer).

The dependency path of the project is: model -> dao -> service -> api. The model struct is passed through all layers until the api layer, where conversion to a DTO is needed.

  • model: put the structure corresponding to the "storage layer", which is a one-to-one implicit reflection of storage.
  • dao: data read-write layer, database and cache are all handled in this layer, including cache miss processing.
  • service: combine various data accesses to build business logic.
  • server: depending on the service defined by proto as the input parameter, it provides a fast global method to start the service.
  • api: defines the API proto file and the generated stub code. The interface generated by it is implemented in the service.
  • Because the service method signatures implement the interface definitions generated from the API, the business logic layer uses DTOs directly and calls dao directly, which simplifies the code.

DO (Domain Object): a domain object, a tangible or intangible business entity abstracted from the real world. This layout lacks a DTO -> DO object conversion.

v2 github.com/go-kratos/kratos-layout

There are api, cmd, configs and internal directories under the app directory. README, CHANGELOG and OWNERS are generally placed in the directory.

  • internal: prevents code from other directories in the same business from referencing internal structs such as biz, data and service.
  • biz: the assembly layer of business logic, the domain layer similar to DDD, and the repo interface of data similar to DDD. The repo interface is defined here and uses the principle of dependency inversion.
  • Data: business data access, including cache, db and other packages, and implements the repo interface of biz. We may confuse data with dao. Data focuses on the meaning of business. What it needs to do is to take out the domain objects again. We have removed the infra layer of DDD.
  • Service: it implements the service layer defined by api, which is similar to the application layer of DDD. It handles the transformation from DTO to biz domain entities (DTO - > do), and cooperates with various biz interactions, but it should not deal with complex logic.

PO(Persistent Object): a persistent object that forms a one-to-one mapping relationship with the data structure of the persistence layer (usually a relational database). If the persistence layer is a relational database, each field (or several fields) in the data table corresponds to one (or several) attributes of the PO. github.com/facebook/ent

start-up

github.com/go-kratos/kratos/blob/m...

Part II

00:41 Case

00:56 Error Case

service error -> gRPC error -> service error

01:24 example demo/v1/greeter_grpc.pb.go

01:33 When an API only updates some fields, a FieldMask is used to identify which fields need to be updated

01:35 Google Design Guide cloud.google.com/apis/design?hl=zh...

01:50 Redis Config Case

Functional options

// Final edition
func main() {
  // load config file from yaml.
  c := new(redis.Config)
  _ = ApplyYAML(c, loadConfig())
  r, _ := redis.Dial(c.Network, c.Address, Options(c)...)
}
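One possible shape of the Options helper used above, illustrating the functional-options pattern. It assumes a redigo-style client (github.com/gomodule/redigo/redis) whose Dial accepts variadic DialOption values; the Config fields and the ApplyYAML/loadConfig helpers in the snippet above are illustrative.

import (
    "time"

    "github.com/gomodule/redigo/redis"
)

// Config mirrors the YAML configuration file; field names are illustrative.
type Config struct {
    Network      string
    Address      string
    Database     int
    Password     string
    ReadTimeout  time.Duration
    WriteTimeout time.Duration
}

// Options converts the loaded config into functional options for redis.Dial,
// so the dial site stays a single line regardless of how many knobs exist.
func Options(c *Config) []redis.DialOption {
    return []redis.DialOption{
        redis.DialDatabase(c.Database),
        redis.DialPassword(c.Password),
        redis.DialReadTimeout(c.ReadTimeout),
        redis.DialWriteTimeout(c.WriteTimeout),
    }
}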

02:30 about testing

Basic requirements for unit test:

  • fast
  • Consistent environment
  • Arbitrary order
  • parallel

Implement a docker-compose-based container dependency management scheme, cross-platform and cross-language, to solve the (mysql, redis, mc) container dependency problem when running unit tests:

  • Install Docker locally.
  • No intrusive environment initialization.
  • Quickly reset the environment.
  • Run anytime, anywhere (independent of external services).
  • Semantic API declaration resources.
  • Real external dependencies, not in-process simulations.
  1. Perform proper health checks on the services in the containers, so that resources are ready before unittest starts.
  2. The app should initialize its own data, such as the db schema and initial SQL data. To keep tests consistent, the containers are destroyed after each test run.
  3. Before unit testing, import the encapsulated testing library to make starting and destroying containers easy.
  4. For unit tests of the service layer, use libraries such as gomock to mock out dao, so packages should be designed against abstractions (interfaces).
  5. When running Docker locally and unittest in the CI environment, you need to consider the Docker network on the physical machine, or run Docker-in-Docker.

Use Go's built-in subtests (t.Run) together with gomock to complete the unit tests (a sketch follows the list below).

  • /api
    It is more suitable for integration testing: test the API directly, using an API testing framework (such as yapi), and maintain a large number of business test cases.
  • /data
    docker compose simulates the underlying infrastructure, so the abstraction layer of infra can be removed.
  • /biz
    Rely on repo and rpc client, and use gomock to simulate the implementation of interface to conduct business unit test.
  • /service
    Depends on biz: construct the biz implementation and pass it in for unit testing.
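A minimal sketch of the subtests approach mentioned above: table-driven cases run with t.Run, and the repo dependency is replaced by a hand-written fake (gomock would generate an equivalent test double from the same interface). All names here are illustrative.

import (
    "context"
    "errors"
    "testing"
)

// UserRepo is the interface biz depends on; gomock would generate a mock for it.
type UserRepo interface {
    GetName(ctx context.Context, id int64) (string, error)
}

type fakeRepo struct {
    name string
    err  error
}

func (f fakeRepo) GetName(ctx context.Context, id int64) (string, error) { return f.name, f.err }

// Greet is the business logic under test.
func Greet(ctx context.Context, repo UserRepo, id int64) (string, error) {
    name, err := repo.GetName(ctx, id)
    if err != nil {
        return "", err
    }
    return "hello " + name, nil
}

func TestGreet(t *testing.T) {
    cases := []struct {
        name    string
        repo    fakeRepo
        want    string
        wantErr bool
    }{
        {name: "ok", repo: fakeRepo{name: "bob"}, want: "hello bob"},
        {name: "repo error", repo: fakeRepo{err: errors.New("db down")}, wantErr: true},
    }
    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            got, err := Greet(context.Background(), tc.repo, 1)
            if (err != nil) != tc.wantErr || got != tc.want {
                t.Fatalf("Greet() = %q, %v", got, err)
            }
        })
    }
}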

Develop features on git branches, run unittest locally, submit a gitlab merge request to run CI unit tests, build from the feature branch and complete functional testing, then merge to master for integration testing, and run regression tests after going live.

Microservice availability design

Isolation

  • Service isolation

    • Dynamic and static isolation, read-write isolation
  • Light and heavy isolation

    • Core vs. non-core, fast vs. slow, hotspot
  • Physical isolation

    • Thread, process, cluster, data center

Timeout control

code: 504
. . .

01:22 head of line blocking

VDSO: blog.csdn.net/juana1/article/detai...

overload protection

code: 429

token-bucket rate limit algorithm: golang.org/x/time/rate

leaky-bucket rate limit algorithm: go.uber.org/ratelimit
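A minimal sketch of the token-bucket limiter from golang.org/x/time/rate; the rate and burst values are illustrative.

package main

import (
    "context"
    "fmt"
    "time"

    "golang.org/x/time/rate"
)

func main() {
    // Token bucket: refill 100 tokens per second, allow bursts of up to 10.
    limiter := rate.NewLimiter(rate.Limit(100), 10)

    // Non-blocking check, e.g. in HTTP middleware: reject with 429 when the bucket is empty.
    if !limiter.Allow() {
        fmt.Println("429 Too Many Requests")
    }

    // Blocking variant: wait for a token or give up when the context expires.
    ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
    defer cancel()
    if err := limiter.Wait(ctx); err != nil {
        fmt.Println("rate limit wait:", err)
    }
}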

The cost of human operation and maintenance is too high. Adaptive algorithms are needed.

Compute the system's peak throughput near overload and use it as the rate-limiting threshold for flow control, so the system protects itself.
When the server approaches overload, it actively sheds a certain amount of load; the goal is self-protection.
Maintain the system's throughput while keeping the system stable.
Common practice: Little's law (L = λW, i.e. in-flight requests ≈ throughput × latency).

Use CPU and memory usage as throttling signals (treated like semaphores).
Queue management: queue length, LIFO.
Controlled-delay algorithm: CoDel; combine CoDel + BBR.

How do we measure the system's throughput near the peak?
CPU: sampled by a dedicated thread, triggered every 250 ms; a simple moving average is used when computing the mean to remove the influence of spikes.
Inflight: the number of requests currently being processed by the service.
Pass & RT: over the last 5 s, pass is the number of successful requests per 100 ms sampling window, and rt is the average response time within a single sampling window.

[Cooldown]
We use the sliding mean of CPU usage (CPU > 800, i.e. 80%) as the heuristic threshold. Once triggered, we enter the overload-protection phase, and the drop condition is: (pass * rt) < inflight.

After rate limiting takes effect, CPU usage will jitter around the threshold (800). Without a cooldown period, a brief dip in CPU could release a flood of requests and, in severe cases, saturate the CPU again.

After the cooldown period, the threshold (CPU > 800) is re-checked to decide whether to remain in overload protection.
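A rough sketch of that decision, under the assumptions in the notes (the window statistics and the CPU sampler are maintained elsewhere; units follow the description above):

// shouldDrop reports whether a new request should be rejected.
// cpuAvg:       sliding mean of CPU usage in permille (800 = 80%), sampled every 250 ms.
// pass:         successful requests per 100 ms sampling window (from the last 5 s).
// rt:           average response time per request, measured in sampling windows.
// inflight:     number of requests currently being processed.
// now, coolOff: keep protection on for a cooldown period after it last triggered.
func shouldDrop(cpuAvg, pass, rt, inflight int64, now, coolOff int64) bool {
    overloaded := cpuAvg > 800 || now < coolOff // heuristic threshold plus cooldown
    if !overloaded {
        return false
    }
    // Drop when concurrency exceeds the estimated capacity: pass * rt ≈ max inflight.
    return inflight > pass*rt
}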

Overload protection is self-protection: every service should implement it.

Rate limiting

Rate limiting is the technique of defining how many requests a client or application can send or have processed within a period of time. For example, rate limiting can filter out clients or microservices that generate traffic spikes, or ensure that the application is not overloaded before auto-scaling takes effect.

  • Token bucket and leaky bucket work on a single node and cannot enforce a distributed (global) limit.
  • QPS-based limiting
    • Different requests may need very different amounts of resources to process.
    • Static QPS limits are therefore not particularly accurate.
  • Per-user limits
    • When a global overload happens, the "exceptional" users must still be controlled.
    • Quotas are "oversold" to a certain degree.
  • Drop by priority.
  • Rejecting a request also has a cost.

Distributed rate limiting

Distributed rate limiting controls an application's global traffic, rather than limiting at the granularity of a single node.

  • For a single high-traffic interface, using redis easily creates hot keys.
  • Requesting quota before every request hurts performance: high-frequency network round trips.

Thoughts:
Upgrade from fetching a single quota to fetching a batch quota. A quota represents a rate; after it is obtained, a token bucket is used to enforce it locally.

  • After each heartbeat, quota is fetched asynchronously in batches, which greatly reduces the frequency of requests to redis; once obtained, requests are throttled locally with a token bucket.
  • The quota requested each time has to be set manually, and a static value (say 20 or 50 each time) is somewhat inflexible.

How can a single node request quota on demand and avoid unfairness?
Use a default value the first time; once the history window contains data, request quota based on the history-window data.

Thoughts:
We often face the problem of dividing a scarce resource among a group of users who all have an equal right to it, but some of whom actually need less than the others.

How should we allocate the resource? A sharing technique widely used in practice is called max-min fair sharing (Max-Min Fairness).

Intuitively, it first fairly satisfies the smallest demands, and then evenly distributes the unused resources among the users with larger demands.

The formal definition of max-min fair allocation is:
Resources are allocated in order of increasing demand.
No user receives more resources than they requested.
Users whose demands were not fully met share the remaining resources equally.
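A small worked sketch of that definition (function and variable names are illustrative):

package main

import (
    "fmt"
    "sort"
)

// maxMinFairShare allocates a total capacity among demands using max-min fairness:
// satisfy the smallest demands first, then split what remains among the rest.
func maxMinFairShare(capacity float64, demands []float64) []float64 {
    idx := make([]int, len(demands))
    for i := range idx {
        idx[i] = i
    }
    // Allocate in order of increasing demand.
    sort.Slice(idx, func(a, b int) bool { return demands[idx[a]] < demands[idx[b]] })

    alloc := make([]float64, len(demands))
    remaining := capacity
    for pos, i := range idx {
        // Fair share of what's left among everyone not yet served.
        share := remaining / float64(len(demands)-pos)
        if demands[i] < share {
            share = demands[i] // nobody gets more than they asked for
        }
        alloc[i] = share
        remaining -= share
    }
    return alloc
}

func main() {
    // Capacity 10 among demands 2, 2.6, 4, 5 -> approximately 2, 2.6, 2.7, 2.7.
    fmt.Println(maxMinFairShare(10, []float64{2, 2.6, 4, 5}))
}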

Criticality (importance)

Configuring a threshold for every interface is heavy to operate. The simplest approach is to configure quota at the service level; more finely, quota can be set per importance level. We introduce criticality:

  • CRITICAL_PLUS: the most important level, reserved for the most critical requests. Rejecting these requests causes very serious user-visible problems.
  • CRITICAL: the default level for requests sent by production jobs. Rejecting these requests also causes user-visible problems, but possibly less severe ones.
  • SHEDDABLE_PLUS: traffic that can tolerate some degree of unavailability. This is the default for requests from batch jobs, which can usually be retried after a few minutes.
  • SHEDDABLE: traffic that may frequently encounter partial unavailability and occasionally complete unavailability.

Criticality needs to be propagated automatically between gRPC systems: if a back end receives request A and, while handling it, sends requests B and C to other back ends, then B and C use the same criticality as A.

  • When the global quota is insufficient, the low priority will be rejected first.
  • Global quotas can be set according to their importance.
  • During overload protection, low priority requests are rejected first.
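One possible way to propagate criticality automatically, using gRPC request metadata (the header name, default level, and helper are illustrative; in practice this would live in client/server interceptors):

import (
    "context"

    "google.golang.org/grpc/metadata"
)

const criticalityKey = "x-criticality" // illustrative header name

// withCriticality copies the incoming request's criticality onto an outgoing
// context, so downstream requests B and C inherit the importance of request A.
func withCriticality(ctx context.Context) context.Context {
    level := "CRITICAL" // default for production traffic
    if md, ok := metadata.FromIncomingContext(ctx); ok {
        if v := md.Get(criticalityKey); len(v) > 0 {
            level = v[0]
        }
    }
    return metadata.AppendToOutgoingContext(ctx, criticalityKey, level)
}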

Circuit breaking (fusing)

Circuit breakers: to limit how long an operation can take, we can use timeouts, which prevent hanging operations and keep the system responsive. But because we operate in a highly dynamic environment, it is almost impossible to pick an exact time limit that is correct in every case. Circuit breakers are named after the real-world electrical component because their behaviour is the same. They are very useful in distributed systems, where repeated failures can snowball and bring down the whole system.

  • There are a lot of errors in the resources that the service depends on.
  • When a user exceeds their resource quota, back-end tasks quickly reject requests with an "insufficient quota" error, but sending the rejection reply still consumes resources. The back end can end up so busy sending rejections that it overloads anyway.

Google SRE
max(0, (requests - K*accepts) / (requests + 1))
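A minimal sketch of that client-side adaptive throttling formula (K = 2 is the commonly cited default; requests and accepts are counts over a trailing window maintained elsewhere):

import "math/rand"

// dropProbability implements the adaptive throttling formula:
// p = max(0, (requests - K*accepts) / (requests + 1)).
// With K around 2, the client lets through roughly twice as many requests
// as the backend has recently accepted before it starts rejecting locally.
func dropProbability(requests, accepts, k float64) float64 {
    p := (requests - k*accepts) / (requests + 1)
    if p < 0 {
        return 0
    }
    return p
}

// shouldReject decides locally, without contacting the backend.
func shouldReject(requests, accepts float64) bool {
    return rand.Float64() < dropProbability(requests, accepts, 2)
}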

github.com/go-kratos/kratos/blob/v...

Gutter Kafka

Client-side rate limiting

github.com/go-kratos/kratos/blob/v...

02:10 DTO Case

02:35 code demonstration: wire dependency injection, elegant database implementation?

Degradation

Provide degraded (lossy) service

retry

load balancing

github.com/go-kratos/kratos/blob/v... (key points)

Review architecture design

Playback history architecture

Component part

Distributed cache & distributed transaction

Log & Index & Link Tracking

DNS & CDN & multi active architecture

kafka

Watermark mechanism of Flink

ByteDance: Flink-based real-time data integration for MQ-Hive

Kafka basic concept

Storage principle: highly dependent on file system

Ultra high throughput achieved by sequential I/O and Page Cache

Keep all published messages, regardless of whether they have been consumed or not

Topic & Partition

Offset: the offset of the current Partition

The same Topic has multiple different partitions

A Partition is the same directory

(3497) global messageno + 3rd message

Sparse index: keeps the index file small so it can be mapped into memory

.timeindex -> .index -> .log ?

Producer & Consumer

01:40 Producer

Consumer

Byte link scheme?

push vs pull

codel+ bbr

Leader & Follower

Lease system

Uneven partition distribution (knapsack algorithm); the core goal is to keep the distribution as balanced as possible

Data reliability

HW (high watermark), LEO (log end offset)

performance optimization

MAD algorithm
