Go Quiz: see the underlying principles and precautions of slice from the go interview questions

Interview questions

Recently, the author of Go 101 released 11 Go interview questions, which are very interesting. He plans to write a series to analyze each question in detail. Welcome your attention.

You can look at the following question about slice. Through this question, we can have an in-depth understanding of the characteristics and precautions of slice.

package main

import "fmt"

func main() {
    a := [...]int{0, 1, 2, 3}
    x := a[:1]
    y := a[2:]
    x = append(x, y...)
    x = append(x, y...)
    fmt.Println(a, x)
}
  • A: [0 1 2 3] [0 2 3 3 3]
  • B: [0 2 3 3] [0 2 3 3 3]
  • C: [0 1 2 3] [0 2 3 2 3]
  • D: [0 2 3 3] [0 2 3 2 3]

You can leave your answers in the comments area. There are several test sites for this question:

  1. What is the underlying data structure of slice? What is assigned to slice?
  2. : what is the relationship between the new slice and the original slice? What is the length and capacity of the new slice?
  3. What does append do behind the scenes?
  4. What is the expansion mechanism of slice?

analysis

Let's answer the above questions one by one.

The underlying data structure of slice

talk is cheap, show me the code. Source code of slice directly:

Slice is defined in Src / Runtime / slice Go line 15, source address: https://github.com/golang/go/....

Pointer is defined in Src / unsafe / unsafe Go line 184, source address: https://github.com/golang/go/....

type slice struct {
    array unsafe.Pointer
    len   int
    cap   int
}

type Pointer *ArbitraryType

slice is actually a structure type, which contains three fields, namely

  • Array: is a pointer to an array in which the sliced data is actually stored.
  • len: the length of the slice.
  • cap: the capacity of the slice, which indicates the maximum number of elements that can be stored in the slice. If it exceeds the existing capacity, it will be automatically expanded.

Therefore, assigning values to slice actually assigns values to these three fields in slice. It seems like a correct nonsense, but believe me, remembering this sentence can help you very clearly understand how the values of the three fields in the slice and the data of the underlying array pointed to by the slice change after the slice is modified.

: split operator

: the split operator has several characteristics:

  1. : you can intercept data from an array or slice, and the result is a new slice.
  2. The array pointer in the new slice structure points to the original array or the underlying array of the original slice. The length of the new slice is: the value on the right minus the value on the left, and the capacity of the new slice is the capacity of the original slice minus the value on the left.
  3. : if no number is written on the left, the default is 0. If no number is written on the right, the default is the length of the divided array or slice.

    a := make([]int, 0, 4) // The length of a is 0 and the capacity is 4
    b := a[:] // Equivalent to B: = a [0:0], the length of B is 0 and the capacity is 4
    c := a[:1] // Equivalent to C: = a [0:1], the length of B is 1 and the capacity is 4
    d := a[1:] // panic: runtime error: slice bounds out of range
    e := a[1:4] // e length 3, capacity 3
  4. : the value on the right of the split operator has an upper limit. There are two upper limits
  • If an array is partitioned, the upper limit is the length of the partitioned array.
  • If the slice is segmented, the upper limit is the capacity of the segmented slice. Note that this is different from the subscript operation. If the subscript index is used to access the slice, the maximum value of the subscript index is (length of slice - 1), not the capacity of the slice.

A picture is worth a thousand words. Let's explain the mechanism of slice segmentation through the following example.

The following figure shows slice structure, ptr represents array pointer, pointing to the underlying array, and len and cap are the length and capacity of slices respectively.

step1: we create a slice s through the code s: = make ([] byte, 5, 5). The length and capacity are 5. The structure is as follows:


Step 2: now split slice s, s2: = s [2:4], and get a new slice s2, with the following structure.

  • s2 also points to the underlying array of the original slice s, but the starting position is the position with the subscript index of 2.
  • The length len(s2) of s2 is 2, because s2: = s [2:4] only intercepts two elements with subscript indexes 2 and 3 of slice s.
  • The capacity cap(s2) of s2 is 3, because three elements can be stored from the array position pointed to by s2 to the end of the underlying array.
  • Because the length is 2, only s2[0] and s2[1] are valid index accesses. However, with a capacity of 3, s2[0:3] is a valid segmentation expression.

step3: segment slice s s3: = S2 [: cap (S2)] to obtain a new slice s3 with the following structure:

  • s3 points to the underlying array of slice s2, which is also the underlying array of s. the starting position pointed to is the starting position of s2, corresponding to the position where the index of the array is 2.
  • The length len(s3) of s3 is 3, because s3: = s2 [: cap (s2)] intercepts three elements of slice s2 with subscript indexes of 0, 1 and 2.
  • The capacity cap(s3) of s3 is 3, because three elements can be stored from the array position pointed to by s3 to the end of the underlying array.

Therefore, the new slice generated by dividing the array or slice still points to the original underlying array, and will not copy a copy of the elements of the original underlying array to the new memory space.

Because they point to the same memory space, the modification of the original array or slice will affect the value of the new slice after segmentation, and vice versa.

append mechanism

To understand the mechanism of append, look directly at the source code description.

// The append built-in function appends elements to the end of a slice. If
// it has sufficient capacity, the destination is resliced to accommodate the
// new elements. If it does not, a new underlying array will be allocated.
// Append returns the updated slice. It is therefore necessary to store the
// result of append, often in the variable holding the slice itself:
//    slice = append(slice, elem1, elem2)
//    slice = append(slice, anotherSlice...)
// As a special case, it is legal to append a string to a byte slice, like this:
//    slice = append([]byte("hello "), "world"...)
func append(slice []Type, elems ...Type) []Type
  • The append function returns a slice. Append adds a new element at the end of the original slice. This end is the end of the slice length, not the end of the slice capacity.

    func test() {
        a := make([]int, 0, 4)
        b := append(a, 1) // B = [1], the first element of the underlying array pointed to by a is 1, but the length and capacity of a remain unchanged
        c := append(a, 2) // The length of a is still 0, C = [2], and the first element of the underlying array pointed to by a becomes 2
        fmt.Println(a, b, c) // [] [2] [2]
    }
  • If the capacity of the original slice is enough to contain the newly added elements, the values of the three fields in the slice structure returned by the append function are:

    • The value of the array pointer field remains the same as that of the array pointer of the original slice, that is, whether append is the slice returned from the underlying array of the original slice or points to the underlying array of the original slice
    • The value of len length field increases accordingly. If N elements are added, the length will increase by N
    • cap capacity unchanged
  • If the capacity of the original slice is not enough to store the newly added elements of append, Go will first allocate a new memory with a larger capacity, then copy all the elements in the original slice, and finally add new elements to the new memory. The values of the three fields in the slice structure returned by the append function are:

    • The value of the array pointer field has changed. It no longer points to the underlying array of the original slice, but to a new memory space
    • The value of len length field increases accordingly. If N elements are added, the length will increase by N
    • cap capacity will increase

Note: append does not change the value of the original slice. The length and capacity of the original slice remain unchanged unless the return value of append is assigned to the original slice.

So the question is, according to what rule is the capacity of the new slice calculated?

slice capacity expansion mechanism

The expansion mechanism of slice changes with the iteration of Go version. At present, most of the statements on the Internet are as follows:

When the capacity of the original slice is less than 1024, the capacity of the new slice becomes twice that of the original slice; The original slice capacity exceeds 1024, and the new slice capacity becomes 1.25 times of the original.

Here we clearly tell you that this conclusion is wrong.

The source code of slice expansion is implemented in Src / Runtime / slice The growslice function in go, source code address: https://github.com/golang/go/....

The expansion implementation code of Go 1.18 is as follows. et is the element type in the slice, old is the original slice, and cap is equal to the length of the original slice + the number of new elements added to append.

func growslice(et *_type, old slice, cap int) slice {
    // ...
    newcap := old.cap
    doublecap := newcap + newcap
    if cap > doublecap {
        newcap = cap
    } else {
        const threshold = 256
        if old.cap < threshold {
            newcap = doublecap
        } else {
            // Check 0 < newcap to detect overflow
            // and prevent an infinite loop.
            for 0 < newcap && newcap < cap {
                // Transition from growing 2x for small slices
                // to growing 1.25x for large slices. This formula
                // gives a smooth-ish transition between the two.
                newcap += (newcap + 3*threshold) / 4
            }
            // Set newcap to the requested cap when
            // the newcap calculation overflowed.
            if newcap <= 0 {
                newcap = cap
            }
        }
    }

    var overflow bool
    var lenmem, newlenmem, capmem uintptr
    // Specialize for common values of et.size.
    // For 1 we don't need any division/multiplication.
    // For sys.PtrSize, compiler will optimize division/multiplication into a shift by a constant.
    // For powers of 2, use a variable shift.
    switch {
    case et.size == 1:
        lenmem = uintptr(old.len)
        newlenmem = uintptr(cap)
        capmem = roundupsize(uintptr(newcap))
        overflow = uintptr(newcap) > maxAlloc
        newcap = int(capmem)
    case et.size == goarch.PtrSize:
        lenmem = uintptr(old.len) * goarch.PtrSize
        newlenmem = uintptr(cap) * goarch.PtrSize
        capmem = roundupsize(uintptr(newcap) * goarch.PtrSize)
        overflow = uintptr(newcap) > maxAlloc/goarch.PtrSize
        newcap = int(capmem / goarch.PtrSize)
    case isPowerOfTwo(et.size):
        var shift uintptr
        if goarch.PtrSize == 8 {
            // Mask shift for better code generation.
            shift = uintptr(sys.Ctz64(uint64(et.size))) & 63
        } else {
            shift = uintptr(sys.Ctz32(uint32(et.size))) & 31
        }
        lenmem = uintptr(old.len) << shift
        newlenmem = uintptr(cap) << shift
        capmem = roundupsize(uintptr(newcap) << shift)
        overflow = uintptr(newcap) > (maxAlloc >> shift)
        newcap = int(capmem >> shift)
    default:
        lenmem = uintptr(old.len) * et.size
        newlenmem = uintptr(cap) * et.size
        capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
        capmem = roundupsize(capmem)
        newcap = int(capmem / et.size)
    }
    // ...
    return slice{p, old.len, newcap}
}

Newcap is the capacity after capacity expansion. First determine the size of the newcap according to the length and capacity of the original slice and the number of elements to be added, and then align the newcap to get the final newcap.

answer

Let's go back to the beginning of this article and analyze the execution results of each line of code line by line.

codeSlice corresponding results
a := [...]int{0, 1, 2, 3}A is an array with a length of 4 and a value of [0 1 2 3]
x := a[:1]x is a slice. The pointer in the slice points to the first element of array a, the value is [0], the length is 1, and the capacity is 4
y := a[2:]y is a slice. The pointer in the slice points to the second element of array a. the value is [2,3], length 2 and capacity 2
x = append(x, y...)The remaining capacity of X is 3, which is enough to store 2 elements in Y, so x will not expand. The value of X is [0, 2, 3], length 3 and capacity 4. Because x, a and y all point to the same memory space, the modification of X affects a and y.
The value of a becomes [0 2 3 3], length 4, capacity 4
The value of y becomes [3], length 2, capacity 2
x = append(x, y...)The remaining capacity of X is only 1, which is not enough to store 2 elements in Y, so it needs to be expanded. The result of append(x, y) is to get a new slice, with a value of [0 2 3 3], a length of 5 and a capacity of 8.
The return value of append is assigned to x, so slice x will point to the new memory after expansion.
fmt.Println(a, x)The value of a is still [0 2 3 3], so the print result is [0 2 3 3] [0 2 3 3 3], and the answer is B

Meal addition: copy mechanism

Go's built-in function copy can copy elements from one slice to another. The source code is defined in Src / builtin / builtin Go, the code is as follows:

// The copy built-in function copies elements from a source slice into a
// destination slice. (As a special case, it also will copy bytes from a
// string to a slice of bytes.) The source and destination may overlap. Copy
// returns the number of elements copied, which will be the minimum of
// len(src) and len(dst).
func copy(dst, src []Type) int

copy copies min(len(dst), len(src)) elements from the original slice src to the target slice dst,

Because the number of copied elements min(len(dst), len(src)) will not exceed the length len(dst) of the target slice, the length and capacity of the target slice will not change after copy is executed.

Note: the memory space of the original slice and the target slice may overlap, and the value of the original slice may be changed after copy. Refer to the following example.

package main

import "fmt"

func main() {
    a := []int{1, 2, 3}
    b := a[1:] // [2 3]
    copy(a, b) // a and b memory spaces overlap
    fmt.Println(a, b) // [2 3 3] [3 3]
}

summary

For slice, always think about how the three fields in slice: pointer, length and capacity change after modifying slice.

  • Slice is a structure type, which contains three fields: array pointer to array, length len and capacity cap. Assigning a value to slice is to assign values to the pointer, length and capacity fields in slice respectively.
  • : the result of the split operator is a new slice. The array pointer in the new slice structure points to the original array or the underlying array of the original slice. The length of the new slice is: the value on the right minus the value on the left, and the capacity of the new slice is the capacity of the original slice minus the value on the left.
  • : the upper limit of the value on the right of the split operator has two conditions:

    • If an array is partitioned, the upper limit is the length of the partitioned array.
    • If the slice is segmented, the upper limit is the capacity of the segmented slice. Note that this is different from the subscript operation. If the subscript index is used to access the slice, the maximum value of the subscript index is (length of slice - 1), not the capacity of the slice.
  • For append and copy operations, the execution logic behind them should be clear.
  • When printing a slice, it is printed according to the length of the slice

    a := make([]int, 1, 4) // The length of a is 1 and the capacity is 4
    b := append(a, 1) // Add element 1 to the end of a, b = [0, 1], the length of a is still 1, and a and b point to the same underlying array
    fmt.Println(a, b) // [0] [0 1]
  • Go does not pass reference when passing parameters to a function, but only values. Some articles on the Internet write that go's slice, map and channel are passed and referenced as parameters, which is wrong. Please refer to my previous articles Does Go have reference variables and reference passing?

Open source address

The open source address of the article and sample code is in GitHub: https://github.com/jincheng9/...

Official account: coding advanced

Personal website: https://jincheng9.github.io/

Thinking questions

Leave 2 thinking questions. You are welcome to leave your answers in the comment area.

  • Topic 1:

    package main
    
    import "fmt"
    
    func main() {
        a := []int{1, 2}
        b := append(a, 3)
    
        c := append(b, 4)
        d := append(b, 5)
    
        fmt.Println(a, b, c[3], d[3])
    }
  • Topic 2

    package main
    
    import "fmt"
    
    func main() {
        s := []int{1, 2}
        s = append(s, 4, 5, 6)
        fmt.Println(len(s), cap(s))
    }

References

Keywords: Go data structure Back-end slice

Added by Perad on Wed, 29 Dec 2021 03:24:41 +0200