Beginners learn the basics of go -- the principle of Slice slice

preface

This paper mainly records:
1. Implementation principle of slice slice.
2. Whether the slice pointer is stored in the heap or stack.
3. Some pits in slicing.

1, Why slices?

Because the array in go is of value type, it is of fixed size when used. For example, the length of the array cannot be changed after arr: = [3] int {1,2,3}.
Therefore, golang designed Slice with dynamic array as the underlying data structure based on array. Dynamic array is a data structure that can dynamically add and delete elements to the array, and can be dynamically expanded when the capacity is insufficient.

2, How is slicing implemented?

1. Slice structure of go – SliceHeader

As shown in the figure below, go uses the SliceHeader struct to represent the dynamic array, which contains a uint pointer of Data to point to the address of the underlying dynamic array, Len represents the number of elements saved in the slice, and Cap represents the total capacity in the current array.

type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}

2. Two ways to initialize slices

Slice initialization can be divided into compile time initialization and run-time initialization:
Compile time initialization includes: 1) generating slices from the array; 2) generating slices using literal values

// Demonstrate go language slicing related operations
// slice initialization method 1: [] string {}
slice1 := []string{"java", "golang", "python", "php"}
fmt.Println(slice1) // [java golang python php]

// slice initialization method 2: it is very common to use the make() function!
slice2 := make([]string, 5)
fmt.Println(len(slice2)) // 5
fmt.Println(slice2) //[] the default is 5 empty strings

// slice is initialized in three ways: it is created through an array
data_arr := [5]int{2,4,6,8,10}
slice3 := data_arr[1:4]
fmt.Println(slice3) // [4 6 8]

In the way of initializing slices during compilation, typechecks such as the type and length of elements can be done in advance during compilation. Therefore, when we declare an array with a length of only 4 and want to obtain a slice with a length of 5 through the array, we can prompt errors during build compilation.

*The underlying principle of compiler initialization slice is actually as follows:

slice := []int{1,2,3}
//In fact, it will go through the following steps:
// Calculate the number of elements, and then initialize an array
var vstat [3]int
vstat[0] = 1
vstat[1] = 2
vstat[2] = 3
// Create a pointer to an array
var vauto *[3]int = new([3]int)
// The array pointer points to the above array
*vauto = vstat
// The slice variable declared by the user points to the array, that is, go back to the first step to initialize from the array
slice := vauto[:]

*There are two ways to initialize the runtime with the make function: make([]int, 5) and make([]int,5,6)
The first slice that specifies Len length but not Cap length generates an element with 5 int default values, and Cap is also a slice of 5.
The second generates a cap of 6, then the bottom layer will open up 6 spaces, but only 5 elements.

3. Does the slice allocate memory on the stack or on the heap?

This is related to:

  • 1. Related to the capacity of the slice.
    When the capacity of the slice is very small, the memory will be allocated directly on the stack. If it is very large, the memory will be allocated directly on the heap, which is similar to the array.

  • 2. Whether the variable of slice pointer has escaped
    We can use go build --gcflags to observe the variable memory allocation process:

// go build -gcflags '-m -l' array.go
// ./array.go:5:2: moved to heap: s
// ./array.go:5:11: make([]int, 3, 4) escapes to heap

func test() *[]int {

   s := make([]int,3,4)

   return &s
}

//Further experiments show that the variable escape of s will not occur in the following scenarios
// ./array.go:5:11: make([]int, 3, 4) does not escape

func test() int {

   s := make([]int,3,4)

   return s[0]
}

// However, if the s slice is returned by the test method, it will still escape. This may be a scenario in which the compiler cannot determine whether it is externally referenced
// ./array.go:5:11: make([]int, 3, 4) escapes to heap
func test() []int {

   s := make([]int,3,4)

   return s
}

It is found that in the process of memory allocation, the variable s escapes to the heap for initialization, that is, the pointer of the slice is stored on the heap, and whether the slice itself is stored on the heap or stack, if the variables of the slice pointer escape, the slice will also be initialized on the heap, although its capacity is very small, it can be initialized on the stack.

Summary:
After learning, this is related to the context of the code fragment:
1) If the variable is explicitly referenced outside the function, memory will certainly be allocated on the heap
2) If the compile time compiler cannot determine whether it is externally referenced, it will also be allocated directly on the heap
3) However, if the slice is only used to take part of the value and return after initialization in the method, there will still be no escape.

3. Expansion of slice:

slice Capacity expansion
slice There are two points to pay attention to in capacity expansion:
1.Principle of capacity expansion 2.Capacity expansion mechanism
1.Principle of capacity expansion:
In fact, the storage space will be reapplied for capacity expansion, and the space length will be the same as that after capacity expansion cap,Then put the original len The length of data is copied to the new slice, and the original slice still exists.
2.Capacity expansion mechanism:
When cap Less than 1024, basically when slice Satisfied cap Later again append New data will be expanded to the original data cap Twice as much as
 When cap Greater than 1024, slice The capacity expansion method is the original cap Of 1.25 times,That is to open up more than 1 on the original basis/4 New space
// Declare an array
arr := [10]int{1,2,3,4,5,6,7,8,9,10}
// The first slice is generated from the array
sub_slice1 := arr[2:4]
// It is found that len is 2 and cap is 8
// This indicates that the slice will reuse its cap as storage space from the original array, and will not apply for new space to save space
fmt.Printf("len:%d, cap:%d \n", len(sub_slice1), cap(sub_slice1)) //len:2, cap:8

// Here is a simple 2x expansion
// First apply for an empty slice
empty_slice := make([]int,0)
fmt.Printf("len:%d, cap:%d \n", len(empty_slice), cap(empty_slice)) //len:0, cap:0
new_slice := append(empty_slice, 1)
fmt.Printf("len:%d, cap:%d \n", len(new_slice), cap(new_slice)) //len:1, cap:1
new_slice = append(new_slice, 2)
fmt.Printf("len:%d, cap:%d \n", len(new_slice), cap(new_slice)) //len:2, cap:2
new_slice = append(new_slice, 3)
new_slice = append(new_slice, 4)
fmt.Printf("len:%d, cap:%d \n", len(new_slice), cap(new_slice)) //len:4, cap:4
// The next expansion can be clearly seen twice
new_slice = append(new_slice, 5)
fmt.Printf("len:%d, cap:%d \n", len(new_slice), cap(new_slice)) //len:5, cap:8

3, What pits need to be paid attention to in the use of slices?

In actual development, I encountered pits using slices:
1. When generating many slices from an array, it should be noted that the bottom layer of each slice is actually the same array, and the update operation will affect the slice contents seen by other slices.
2. When we want to add and append slices in other methods, we must return a slice and receive it with variables to complete the addition and deletion of slices, otherwise it will be invalid.
3. The problem of large slice. The performance of large slice will be very low during capacity expansion or copy. It should be noted that once the append operation of large slice exceeds the limit when dynamically adding memory, the program will crash; And copy a large slice will be very slow.

summary

Slicing is a very common type of dynamic array in go development. In addition to simple application, understanding its principle will help us to complete the development more efficiently.
Because the native library of go language has no linked list, queue, stack and other higher-level data structures, we can realize the structure of queue and stack based on Slice (that is, dynamic array). We can have a deeper understanding of Slice in the process of adding, deleting, modifying and querying data structures.

Keywords: Go

Added by tkreinbring on Sun, 19 Dec 2021 17:52:54 +0200