Zero-copy conversion between string and []byte

Zero-copy

While reading Gin's source code, I came across the following two functions, which convert between string and []byte without copying the underlying data. Let's analyze how this is achieved.

1. Basic data structure

// StringToBytes converts string to byte slice without a memory allocation.
func StringToBytes(s string) (b []byte) {
	sh := *(*reflect.StringHeader)(unsafe.Pointer(&s))
	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	bh.Data, bh.Len, bh.Cap = sh.Data, sh.Len, sh.Len
	return b
}

// BytesToString converts byte slice to string without a memory allocation.
func BytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

Let's look at the runtime representation of string and []byte:

// StringHeader is the runtime representation of a string.
// It cannot be used safely or portably and its representation may
// change in a later release.
// Moreover, the Data field is not sufficient to guarantee the data
// it references will not be garbage collected, so programs must keep
// a separate, correctly typed pointer to the underlying data.
type StringHeader struct {
	Data uintptr
	Len  int
}

// SliceHeader is the runtime representation of a slice.
// It cannot be used safely or portably and its representation may
// change in a later release.
// Moreover, the Data field is not sufficient to guarantee the data
// it references will not be garbage collected, so programs must keep
// a separate, correctly typed pointer to the underlying data.
type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}

Both structures are defined in the Go source tree, in src/reflect/value.go.

The Data field has type uintptr, an integer type large enough to hold any pointer:

// uintptr is an integer type that is large enough to hold the bit pattern of
// any pointer.
type uintptr uintptr

The representations of string and []byte are very similar: the SliceHeader of a []byte simply adds a Cap field.

A string is read-only. The difference between a string and a slice is that a slice can grow dynamically, which is why the slice header needs the extra Cap field.
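The layout described above can be checked with a short program. This is only a minimal sketch, not code from Gin: headerWords and sharesMemory are hypothetical helper names, and the sketch assumes the header layouts shown above, which the reflect comments warn may change in a later release. It also assumes Go 1.18 or later for strings.Clone.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// headerWords reports the size of the string and slice headers,
// measured in machine words (the size of a uintptr).
func headerWords() (str, slice uintptr) {
	word := unsafe.Sizeof(uintptr(0))
	var s string
	var b []byte
	return unsafe.Sizeof(s) / word, unsafe.Sizeof(b) / word
}

// sharesMemory converts a byte slice to a string with the same cast as
// BytesToString, then modifies the slice. Because no copy was made, the
// change is visible through the string as well.
func sharesMemory() (before, after string) {
	b := []byte("hello")
	s := *(*string)(unsafe.Pointer(&b))
	before = strings.Clone(s) // independent copy of the current contents
	b[1] = 'E'
	after = strings.Clone(s) // s now reflects the write to b
	return before, after
}

func main() {
	str, slice := headerWords()
	fmt.Println("string header words:", str)  // Data + Len
	fmt.Println("slice header words:", slice) // Data + Len + Cap
	before, after := sharesMemory()
	fmt.Println(before, after)
}
```

The second helper is the reason the cast has to be handled with care: the string and the slice are two views of the same memory, so a string that is supposed to be immutable can change underneath its user.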

2. Performance difference

import "testing"

var x = "hello world"

// BenchmarkBytesToString measures the zero-copy cast.
// Despite its name, it exercises StringToBytes.
func BenchmarkBytesToString(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = StringToBytes(x)
	}
}

// BenchmarkBytesToStringNormal measures the built-in []byte(x) conversion.
func BenchmarkBytesToStringNormal(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = []byte(x)
	}
}

Run the benchmarks above to compare the two approaches:

go test -bench="." -benchmem

BenchmarkBytesToString-8                1000000000               0.3072 ns/op          0 B/op          0 allocs/op
BenchmarkBytesToStringNormal-8          238276338                5.011 ns/op           0 B/op          0 allocs/op
PASS
ok      hello/cmd       2.944s

With the unsafe cast, each operation takes 0.3072 ns and allocates no memory.

With the built-in []byte(x) conversion, each operation takes 5.011 ns, also without allocating any memory.

This raises a question: why does the built-in string-to-[]byte conversion not allocate memory here? Let's look at how the Go runtime handles it:

// The constant is known to the compiler.
// There is no fundamental theory behind this number.
const tmpStringBufSize = 32

type tmpBuf [tmpStringBufSize]byte

func stringtoslicebyte(buf *tmpBuf, s string) []byte {
	var b []byte
	if buf != nil && len(s) <= len(buf) {
		*buf = tmpBuf{}
		b = buf[:len(s)]
	} else {
		b = rawbyteslice(len(s))
	}
	copy(b, s)
	return b
}


// rawbyteslice allocates a new byte slice. The byte slice is not zeroed.
func rawbyteslice(size int) (b []byte) {
	cap := roundupsize(uintptr(size))
	p := mallocgc(cap, nil, false)
	if cap != uintptr(size) {
		memclrNoHeapPointers(add(p, uintptr(size)), cap-uintptr(size))
	}

	*(*slice)(unsafe.Pointer(&b)) = slice{p, size, int(cap)}
	return
}

From the code above we can see that the runtime first checks the length during the conversion. If a temporary buffer is available (buf != nil) and the string fits in it, that is, len(s) <= tmpStringBufSize (currently 32), the bytes are simply copied into that buffer with copy(b, s). When the string is longer than 32 bytes, rawbyteslice is called instead, and it allocates memory sized according to the length of the string. Converting long strings therefore triggers a memory allocation on every conversion, and performance drops.

Now let's test with a much longer string and look at the results:
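The 32-byte threshold can be observed without reading the runtime source. The following is a minimal sketch under stated assumptions: allocsPerConversion, short, long and sink are hypothetical names introduced here, and the result depends on the compiler's escape analysis, so the numbers may differ between Go versions. It uses testing.AllocsPerRun from the standard library to count heap allocations for a conversion whose result stays local to the function.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

var (
	short = strings.Repeat("a", 32) // fits in the 32-byte temporary buffer
	long  = strings.Repeat("a", 33) // one byte too large for it
	sink  byte                      // keeps the result of the work observable
)

// allocsPerConversion reports the average number of heap allocations made
// by one []byte(s) conversion whose result does not escape the closure.
// s must not be empty.
func allocsPerConversion(s string) float64 {
	return testing.AllocsPerRun(1000, func() {
		b := []byte(s)
		b[0] = 'b' // write to the copy so the conversion cannot be skipped
		sink = b[0]
	})
}

func main() {
	fmt.Println("32 bytes:", allocsPerConversion(short))
	fmt.Println("33 bytes:", allocsPerConversion(long))
}
```

If the runtime behaves as described above, the 32-byte string should be converted without a heap allocation, while the 33-byte string should cost one allocation per conversion.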

var x = `hello world
hello world hello worldhello worldhello world
hello worldhello worldhello worldhello world
hello worldhello worldhello worldhello worldhello
`

BenchmarkBytesToString-8                1000000000               0.3075 ns/op          0 B/op          0 allocs/op
BenchmarkBytesToStringNormal-8          20812899                49.95 ns/op          160 B/op          1 allocs/op

As you can see, the built-in conversion is now roughly ten times slower (49.95 ns/op versus 5.011 ns/op) and allocates 160 bytes per operation, while the unsafe cast is essentially unchanged.

3. One more question: why?

Since the cast is so much faster, why do we normally still use the built-in conversion? Or why doesn't the language simply replace the built-in conversion with the cast? Consider the following code:

// Approach 1: built-in conversion. b is a copy, so writing to it is safe.
func main() {
	b := []byte("hello")
	b[1] = 'S'
	fmt.Println(string(b))
}

// Approach 2: zero-copy cast. b shares the string's read-only memory, so the write crashes.
func main() {
	defer func() {
		err := recover()
		if err != nil {
			log.Println(err)
		}
	}()
	x := "hello"
	b := StringToBytes(x)
	b[1] = 'S'
	fmt.Println(x)
}

As mentioned earlier, a Go string is read-only. Many languages make a similar choice, such as Rust, which has become popular in recent years. Goroutines are one of the highlights of Go, and read-only strings can be shared between goroutines without locking, which effectively reduces lock contention.

With the cast, modifying the contents of the byte slice is illegal, because the slice still shares the string's pointer and the memory it points to is read-only. The write triggers a fatal runtime error that crashes the program and cannot be caught with recover.

The built-in conversion is therefore a compromise between performance and safety that reduces the mental burden on users. In performance-sensitive applications such as gateways, or wherever conversions are very frequent, the cast can be used, provided that the resulting bytes are never modified.
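The rule of thumb above (cast for read-only use, built-in conversion when the bytes must change) can be sketched as follows. This is an illustration rather than the code quoted from Gin: it assumes Go 1.20 or later, where unsafe.String, unsafe.StringData and unsafe.SliceData are available, and the names stringToBytes, bytesToString, checksum and mutableCopy are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"unsafe"
)

// stringToBytes returns a byte slice that shares the memory of s.
// The caller must never write to the result.
func stringToBytes(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// bytesToString returns a string that shares the memory of b.
// The caller must not modify b while the string is in use.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// checksum only reads the bytes, so the zero-copy view is safe here.
func checksum(s string) [32]byte {
	return sha256.Sum256(stringToBytes(s))
}

// mutableCopy is the safe route when the bytes must be changed: the
// built-in conversion copies, so the original string is left untouched.
func mutableCopy(s string) []byte {
	b := []byte(s)
	if len(b) > 1 {
		b[1] = 'S'
	}
	return b
}

func main() {
	x := "hello"
	fmt.Printf("%x\n", checksum(x))
	fmt.Println(string(mutableCopy(x)), x)
}
```

Keeping the cast behind small helpers like these, with the read-only contract written in the comments, makes it easier to audit the places where the safety guarantee of the built-in conversion has been given up.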

4. Keep digging?

Is zero copy in Go the same thing as zero copy in the operating system? Why is zero copy needed, and how does the operating system implement it? Which software and products use zero copy, and what performance gains has it brought them? I originally planned to cover these questions in this article, but for reasons of space I will leave them for the next one.

Added by mdj on Mon, 07 Mar 2022 05:01:35 +0200