Zero-copy
While working on a project, I looked through Gin's source code and found the following two functions, which convert between string and []byte without copying the data. Let's analyze how they work.
1. Basic data structure
    // StringToBytes converts string to byte slice without a memory allocation.
    func StringToBytes(s string) (b []byte) {
        sh := *(*reflect.StringHeader)(unsafe.Pointer(&s))
        bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
        bh.Data, bh.Len, bh.Cap = sh.Data, sh.Len, sh.Len
        return b
    }

    // BytesToString converts byte slice to string without a memory allocation.
    func BytesToString(b []byte) string {
        return *(*string)(unsafe.Pointer(&b))
    }
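For comparison, here is a minimal sketch of the same two conversions written without the reflect header types, which have been deprecated since Go 1.20. It assumes Go 1.20 or later and uses unsafe.StringData, unsafe.SliceData, unsafe.Slice and unsafe.String. The lowercase names stringToBytes and bytesToString are hypothetical and are not Gin's API.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytes (hypothetical name) returns a byte slice that shares the
// string's memory. The result must never be written to.
func stringToBytes(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// bytesToString (hypothetical name) returns a string that shares the slice's
// memory. The slice must not be modified while the string is in use.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	s := "hello world"
	b := stringToBytes(s)
	fmt.Println(len(b), cap(b), string(b))

	// Identical data pointers mean that no copy was made.
	fmt.Println(unsafe.StringData(s) == unsafe.SliceData(b))
}
```

The empty-input checks are a design choice of this sketch: they avoid handing out a pointer for a zero-length value.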
Let's look at the runtime representations of string and []byte (a byte slice):
    // StringHeader is the runtime representation of a string.
    // It cannot be used safely or portably and its representation may
    // change in a later release.
    // Moreover, the Data field is not sufficient to guarantee the data
    // it references will not be garbage collected, so programs must keep
    // a separate, correctly typed pointer to the underlying data.
    type StringHeader struct {
        Data uintptr
        Len  int
    }

    // SliceHeader is the runtime representation of a slice.
    // It cannot be used safely or portably and its representation may
    // change in a later release.
    // Moreover, the Data field is not sufficient to guarantee the data
    // it references will not be garbage collected, so programs must keep
    // a separate, correctly typed pointer to the underlying data.
    type SliceHeader struct {
        Data uintptr
        Len  int
        Cap  int
    }
Both structures are defined in go/src/reflect/value.go.
The Data field has type uintptr, an integer type that is large enough to hold any pointer:
    // uintptr is an integer type that is large enough to hold the bit pattern of
    // any pointer.
    type uintptr uintptr
The two headers are very similar: SliceHeader is StringHeader plus a Cap field.
A string is read-only, whereas a slice can grow dynamically. That is why the slice header carries the additional Cap field.
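The layout difference described above can be checked directly. The following is a small sketch, assuming the standard gc toolchain, that measures both headers in machine words and shows a slice growing past its capacity. The helper name headerWords is made up for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerWords (hypothetical helper) reports how many machine words the
// string header and the []byte header occupy.
func headerWords() (str, slice uintptr) {
	var s string
	var b []byte
	word := unsafe.Sizeof(uintptr(0))
	return unsafe.Sizeof(s) / word, unsafe.Sizeof(b) / word
}

func main() {
	str, slice := headerWords()
	// string: Data + Len; slice: Data + Len + Cap.
	fmt.Println(str, slice)

	// A slice can grow: appending beyond cap makes append allocate a
	// larger backing array, which is what the Cap field keeps track of.
	b := make([]byte, 0, 4)
	b = append(b, "hello"...)
	fmt.Println(len(b), cap(b) >= 5)
}
```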
2. Performance difference
    import "testing"

    var x = "hello world"

    // BenchmarkBytesToString measures the zero-copy conversion (StringToBytes).
    func BenchmarkBytesToString(b *testing.B) {
        for i := 0; i < b.N; i++ {
            _ = StringToBytes(x)
        }
    }

    // BenchmarkBytesToStringNormal measures the built-in []byte(x) conversion.
    func BenchmarkBytesToStringNormal(b *testing.B) {
        for i := 0; i < b.N; i++ {
            _ = []byte(x)
        }
    }

Run these benchmarks to compare the two approaches:
go test -bench="." -benchmem
    BenchmarkBytesToString-8          1000000000    0.3072 ns/op    0 B/op    0 allocs/op
    BenchmarkBytesToStringNormal-8    238276338     5.011 ns/op     0 B/op    0 allocs/op
    PASS
    ok      hello/cmd    2.944s
With the unsafe conversion, each operation takes 0.3072 nanoseconds and allocates no memory.
With the built-in []byte(x) conversion, each operation takes 5.011 nanoseconds and also allocates no memory.
This raises a question: why does the built-in string-to-[]byte conversion allocate no memory here? Let's look at how the Go runtime handles it:
    // The constant is known to the compiler.
    // There is no fundamental theory behind this number.
    const tmpStringBufSize = 32

    type tmpBuf [tmpStringBufSize]byte

    func stringtoslicebyte(buf *tmpBuf, s string) []byte {
        var b []byte
        if buf != nil && len(s) <= len(buf) {
            *buf = tmpBuf{}
            b = buf[:len(s)]
        } else {
            b = rawbyteslice(len(s))
        }
        copy(b, s)
        return b
    }

    // rawbyteslice allocates a new byte slice. The byte slice is not zeroed.
    func rawbyteslice(size int) (b []byte) {
        cap := roundupsize(uintptr(size))
        p := mallocgc(cap, nil, false)
        if cap != uintptr(size) {
            memclrNoHeapPointers(add(p, uintptr(size)), cap-uintptr(size))
        }

        *(*slice)(unsafe.Pointer(&b)) = slice{p, size, int(cap)}
        return
    }
From the code above, we can see that the runtime first checks the length during the conversion. If a temporary buffer was supplied (buf is non-nil) and the string fits in it, that is, its length is at most tmpStringBufSize (currently 32), the bytes are copied straight into that buffer with copy(b, s) and no heap allocation is needed. When the string is longer than 32 bytes, rawbyteslice is called to allocate memory, with the size computed from the string's length. So converting large strings repeatedly causes frequent allocations, and performance drops.
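The threshold described above can be observed without a full benchmark. The sketch below uses testing.AllocsPerRun to count heap allocations for a short and a long string. The helper name allocsPerConversion is hypothetical. The exact counts depend on the Go version and on escape analysis, so the only assumption made here is that the long string costs more allocations than the short one.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// allocsPerConversion (hypothetical helper) reports the average number of
// heap allocations made by one []byte(s) conversion whose result stays local.
// s must not be empty.
func allocsPerConversion(s string) float64 {
	total := 0
	allocs := testing.AllocsPerRun(1000, func() {
		b := []byte(s)
		b[0] ^= 1 // write to the copy so the conversion cannot be skipped
		total += int(b[0])
	})
	_ = total
	return allocs
}

func main() {
	short := "hello world"                     // 11 bytes, fits in a 32-byte buffer
	long := strings.Repeat("hello world ", 13) // 156 bytes, too large for it

	fmt.Println("short:", allocsPerConversion(short))
	fmt.Println("long: ", allocsPerConversion(long))
}
```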
Now let's run the same benchmarks with a much longer string and look at the results:
var x = `hello world hello world hello worldhello worldhello world hello worldhello worldhello worldhello world hello worldhello worldhello worldhello worldhello `
    BenchmarkBytesToString-8          1000000000    0.3075 ns/op    0 B/op      0 allocs/op
    BenchmarkBytesToStringNormal-8    20812899      49.95 ns/op     160 B/op    1 allocs/op

As you can see, the built-in conversion is now almost ten times slower (49.95 ns/op versus 5.011 ns/op) and allocates 160 bytes per operation, while the unsafe conversion has not changed noticeably.
3. One more question: why?
If the unsafe conversion is so fast, why do we normally use the built-in conversion? Or why doesn't Go simply replace the built-in conversion with the unsafe one? Let's look at the following code:
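    // Mode 1: built-in conversion. b is a private copy, so writing to it is safe.
    func main() {
        b := []byte("hello")
        b[1] = 'S'
        fmt.Println(string(b))
    }

    // Mode 2: unsafe conversion. b shares the string's read-only memory.
    func main() {
        defer func() {
            err := recover()
            if err != nil {
                log.Println(err)
            }
        }()
        x := "hello"
        b := StringToBytes(x)
        b[1] = 'S' // the program crashes here
        fmt.Println(x)
    }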
As mentioned earlier, Go strings are read-only. Many languages make a similar choice, such as Rust, which has become popular in recent years. Goroutines are a highlight of Go, and read-only strings can be shared between goroutines without locking, which effectively reduces lock contention.
With the unsafe conversion, modifying the bytes is illegal: the slice still shares the string's pointer, and the memory it points to is read-only. The write crashes the program with a fatal error that recover cannot catch.
The built-in conversion is therefore a compromise between performance and safety that reduces the mental burden on users. In performance-sensitive applications such as gateways, or in code that converts between the two types very frequently, the unsafe conversion can be used, provided the resulting bytes are never modified.
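The opposite direction has its own pitfall, which is worth a quick sketch: a string produced by a zero-copy conversion keeps sharing memory with the original slice, so a later write to the slice silently changes the "read-only" string. The function bytesToString below mirrors Gin's BytesToString, and the helper name aliased is made up for this illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// bytesToString mirrors Gin's BytesToString: the returned string shares
// the memory of b instead of copying it.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// aliased (hypothetical helper) returns the string's contents before and
// after the underlying slice is modified.
func aliased() (before, after string) {
	b := []byte("hello")
	s := bytesToString(b)

	before = strings.Clone(s) // independent copy of the current contents
	b[0] = 'J'                // also changes s, because s shares b's memory
	return before, s
}

func main() {
	before, after := aliased()
	fmt.Println(before, after)
}
```

This is why the zero-copy form is only appropriate when the slice is no longer written to once the string exists.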
4. Keep digging?
Is zero copy in Go the same thing as zero copy in the operating system? Why is zero copy needed, and how does the operating system implement it? Which software and products use zero copy, and what performance gains has it brought them? I originally planned to cover these questions in this article, but for reasons of space I will continue with them in the next one.