Go slicing and tips (latest version in 2022)

Hello, I'm Taibai. Today I'd like to talk to you about Slice of Go language and some of its skills.

Original address (illustrated in the original): mp.weixin.qq.com/s/pNfx7AtXg_SiapW...

Welcome to my first official account (which will share Go, micro services, architecture, open source projects, source analysis, etc.):

1 Preface

Slicing is one of the most used data structures in Go language. This article will introduce the creation, operation, expression and skills of slicing.

2 array

Slice type is an abstraction based on the array type of Go. Therefore, to understand slice, we must first understand array.

Let's start with a code:

    var a [4]int // Array type: defines the specified length and element type
    a[0] = 1     // The value of the first index of array a is set to 1
    i := a[0]
    fmt.Println(i)    // Output: 1, indicating that it has been set successfully
    fmt.Println(a[1]) // Output: 0, indicating that the array does not need explicit initialization, and its element itself is zero
    fmt.Println(a[4]) // Error reporting: invalid array index 4 (out of bounds for 4-element array)

(code from) Go Slices: usage and internals)

explain:

  • 1. In the code, var a [4]int completes the initialization of the array, representing an array composed of four integers.
  • 2. The default value is 0. Output 0 when we print a[1].
  • 3. Set the value of the array index. You can directly specify the index of the subscript for assignment. In the example, a[0] = 1.
  • 3. The index of Go array is the same as that of other languages. Starting from 0, the size of the array is fixed. In the above example, the maximum size is 4, so the maximum subscript index is 3. Therefore, when we access a[4], we will report an error that the array is out of bounds.

The array of Go is a value type. Unlike C language, array variables are pointers to the first element, so when we pass or assign array variables, we actually do the copy operation. For example, in the following example, a assigns a value to b and modifies the elements in a without affecting the elements in b:

    a := [2]string{"johnny", "Taibai Technology"} // The length can be changed to... `, The compiler automatically calculates the length of the array
    b := a
    a[0] = "xujiajun"
    fmt.Println(a) // Output: [xujiajun Taibai technology]
    fmt.Println(b) // Output: [johnny Taibai technology]

To avoid copying, you can pass a pointer to the array, for example:

func double(arr *[3]int) {
    for i, num := range *arr {
        (*arr)[i] = num * 2
    }
}
a := [3]int{1, 2, 3}
double(&a)
fmt.Println(a) // [2 4 6]

(code reference) Effective Go)

3 slice creation

Because arrays need a fixed length, they are often not very flexible, and slicing is more flexible and powerful. Slices contain references to the underlying array. If one slice is assigned to another slice, both refer to the same array.

Underlying data structure of slice (src/runtime/slice.go):

type slice struct {
    array unsafe.Pointer // Pointer to underlying array
    len   int  // Slice length
    cap   int  // Underlying array capacity
}

The type specification of the slice is [] T, where T is the type of the slice element. Unlike array types, slice types do not have a specified length.

There are many ways to create slices: variable declaration, slice literal, make creation, new creation, and interception from slices / arrays.

1. Variable declaration

    var s []byte

The default value of this declared slice variable is nil, and the default value of capacity and length is 0.

    var x []byte
    fmt.Println(cap(x))   // 0
    fmt.Println(len(x))   // 0
    fmt.Println(x == nil) // true

2. Slice literal

When you declare slices with literals, the syntax is very similar to declaring arrays with literals:

a := []string{"johnny", "Taibai Technology"} // This is a slicing declaration without specifying the length
b := [2]string{"johnny", "Taibai Technology"} // This is an array declaration, with more length specified

3. make create:

In addition to the above creation method, the built-in function make can be used to specify the length and capacity to create (specific function: func make([]T, len, cap) []T)

Where T represents the element type of the slice to be created. The make function takes the type, length, and optional capacity. When called, make allocates an array and returns a slice that references the array.

First look at the empty slice:

    y := make([]int, 0)
    fmt.Println(cap(y))   // 0
    fmt.Println(len(y))   // 0
    fmt.Println(y == nil) // false

In the following example, a slice with a length of 5 and a capacity equal to 5 is created

    var s []byte
    s = make([]byte, 5, 5) // Length and capacity are specified
    fmt.Println(s) // Output: [0]

When the capacity parameter is omitted, it defaults to the specified length:

s := make([]byte, 5)
fmt.Println(s) // The output is also [0 0 0]

The built-in len and cap functions check the length and capacity of the slice:

    fmt.Println(len(s)) // 5
    fmt.Println(cap(s)) // 5

4. Use new

The slice created with new is a nil slice.

    n := *new([]int)
    fmt.Println(n == nil) // true

5. Intercept from slice / array

Slices can be created based on arrays and slices. Slice expressions will be used here, which will be described in detail in 3.3.

    n := [5]int{1, 2, 3, 4, 5}
    n1 := n[1:]     // Intercept from n array
    fmt.Println(n1) // [2 3 4 5]
    n2 := n1[1:]    // Cut from n1 slice
    fmt.Println(n2) // [3 4 5]

The slice and the original array or slice share the underlying space. Then, in the above example, modifying the element of n2 will affect the original slice and array:

    n2[1] = 6 // Modifying n2 will affect the original slice and array
    fmt.Println(n1) // [2 3 6 5]
    fmt.Println(n2) // [3 6 5]
    fmt.Println(n)  // [1 2 3 6 5]

4 slicing operation

The built-in function append() is used to append elements to the slice.

    n := make([]int, 0)
    n = append(n, 1)                 // Add an element
    n = append(n, 2, 3, 4)           // Add multiple elements
    n = append(n, []int{5, 6, 7}...) // Add a slice
    fmt.Println(n)                   // [1 2 3 4 5 6 7]

During the append operation, if the slice capacity is insufficient, the expansion will be triggered. Follow the example above:

    fmt.Println(cap(n)) // Capacity equals 8
    n = append(n, []int{8, 9, 10}...)
    fmt.Println(cap(n)) // The capacity is equal to 16, and the capacity has been expanded

When the capacity is 8 at the beginning and slices [] int{8, 9, 10} are added later, the capacity becomes 16. We will talk about the detailed capacity expansion mechanism later.

5 slice expression

The Go language provides expressions for two slices:

  • 1. Simple expression
  • 2. Extended expression

1. Simple expression

Format of simple slice expression [low:high].

As shown in the following example, n is a slice. When this expression [1:4] is used to represent the left closed and right open [low, high] interval, a new slice is intercepted (the result in the example is [23,4]). After the slice is intercepted, the intercepted length is high low.

    n := []int{1, 2, 3, 4, 5, 6}
 fmt.Println(n[1:4])      // [2 3 4]

The start low and end index high of the slice expression are optional; They default to zero and the length of the slice:

n := []int{1, 2, 3, 4, 5, 6}
fmt.Println(n[:4])      // [1 2 3 4],: there is no value in the front, and the default is 0
fmt.Println(n[1:])       // [2 3 4 5 6],: there is no value after it. The default is the length of the slice

Boundary problem

  • 1. When n is the value relationship between low and high in array or string expression n[low:high]:

    0 <= low <=high <= len(n)

  • 2. When n is a slice, the maximum value of high in the expression n[low:high] becomes cap(n). The value relationship between low and high:

    0 <= low <=high <= cap(n)

If the above conditions are not met, an out of bounds panic will be sent.

Intercept string

Note that the string is intercepted and a new string is generated.

    s := "hello taibai"
    s1 := s[6:]
    fmt.Println(s1)                 // taibai
    fmt.Println(reflect.TypeOf(s1)) // string

2. Extended expression

The new slice produced by simple expression will share the underlying array with the original array or slice. Although copy is avoided, it will bring some risks.
In the following example, when the new n1 slice append adds an element, it overwrites the value of index position 4 of the original n, resulting in unexpected consequences for your program.

    n := []int{1, 2, 3, 4, 5, 6}
    n1 := n[1:4]
    fmt.Println(n)       // [1 2 3 4 5 6]
    fmt.Println(n1)      // [2 3 4]
    n1 = append(n1, 100) // Change the value of index position 4 of n from 5 to 100
    fmt.Println(n)       // [1 2 3 4 100 6]

Go 1.2 An expression that limits the capacity of new slices is provided in:

n[lowmax]

max represents the capacity of the newly generated slice. The capacity of the new slice is equal to max low. In the expression, the relationship between low, high and max is:

0 <= low <= high <= max <= cap(n)

Continue with the example just now. When calculating the capacity of n[1:4], use cap to get the value equal to 5, and use the extended expression n[1]5] , recalculate with cap to obtain a new capacity value (5-1) equal to 4:

    fmt.Println(cap(n[1:4])) // 5
    fmt.Println(cap(n[1:4:5])) // 4

About capacity

The length of n[1:4] is 3. It's easy to understand (4-1). Why is the capacity 5?

Because slice n[1:4] and slice n share the underlying space, its capacity is not equal to its length 3. According to the position where 1 is equal to index 1 (equal to value 2), there are five elements from value 2 to end element 6, so the capacity of n[1:4] is 5.

If append exceeds the length of the slice, a new slice will be produced and the original one will not be overwritten:

    n2 := n[1:4:5]         // Length equals 3 and capacity equals 4
    fmt.Printf("%p\n", n2) // 0xc0000ac068
    n2 = append(n2, 5)
    fmt.Printf("%p\n", n2) // 0xc0000ac068
    n2 = append(n2, 6)
    fmt.Printf("%p\n", n2) // Address changed, 0xc0000b8000

6 slicing techniques

Go introduced slicing techniques on Github's official Wiki SliceTricks In addition, this project Go Slice Tricks Cheat Sheet A series of diagrams are made based on SliceTricks, which is more intuitive. The map made by go slice tricks check sheet is missing, and I also added it

Note: the last update time of the official Wiki is 2022.1.22. The following is based on this version.

AppendVector

This is the operation of adding a slice, which we have described in the slice operation above.

a = append(a, b...)

Copy

Here we show three ways to write copy:

b := make([]T, len(a))
copy(b, a)

// Efficiency is generally slower than the above writing, but if there are more, they are more efficient
b = append([]T(nil), a...)
b = append(a[:0:0], a...)

// This implementation is equivalent to make+copy.
// But in Go tool chain v1 It's actually slower on the 16.
b = append(make([]T, 0, len(a)), a...)

Cut

Cut off the elements between slices [i,j):

a = append(a[:i], a[j:]...)

Cut(GC)

If the above Cut element is a pointer, there will be a memory leak, so we need to set nil for the deleted element and wait for GC.

copy(a[i:], a[j:])
for k, n := len(a)-j+i, len(a); k < n; k++ {
    a[k] = nil // or the zero value of T
}
a = a[:len(a)-j+i]

Delete

Delete element at index position i:

a = append(a[:i], a[i+1:]...)
// or
a = a[:i+copy(a[i:], a[i+1:])]

Delete(GC)

Delete element at index position i:

copy(a[i:], a[i+1:])
a[len(a)-1] = nil // or the zero value of T
a = a[:len(a)-1]

Delete without preserving order

Delete the element at index position i, put the last bit on index position i, and then delete the last element. In this way, there is no replication operation at the bottom.

a[i] = a[len(a)-1] 
a = a[:len(a)-1]

Delete without preserving order(GC)

In the above deletion operation, the element is a pointer type or structure pointer field, and there will be the last element that cannot be dropped by GC, resulting in leakage. Set the end element to nil and wait for GC.

a[i] = a[len(a)-1]
a[len(a)-1] = nil
a = a[:len(a)-1]

Expand

This is essentially a combination of multiple append operations.

a = append(a[:i], append(make([]T, j), a[i:]...)...)

Extend

Expand the original list with the new list

a = append(a, make([]T, j)...)

Filter (in place)

The following code demonstrates deleting Go slice elements in place:

n := 0
for _, x := range a {
    if keep(x) {
        a[n] = x
        n++
    }
}
a = a[:n]

Insert

a = append(a[:i], append([]T{x}, a[i:]...)...)

The second append will generate a new slice and a copy. You can use the following code to avoid the second copy:

s = append(s, 0 /* Add a 0 value first*/)
copy(s[i+1:], s[i:])
s[i] = x

InsertVector

The following code demonstrates the implementation of insertion vector (sequential container encapsulating dynamic size array):

a = append(a[:i], append(b, a[i:]...)...)
func Insert(s []int, k int, vs ...int) []int {
    if n := len(s) + len(vs); n <= cap(s) {
        s2 := s[:n]
        copy(s2[k+len(vs):], s[k:])
        copy(s2[k:], vs)
        return s2
    }
    s2 := make([]int, len(s) + len(vs))
    copy(s2, s[:k])
    copy(s2[k:], vs)
    copy(s2[k+len(vs):], s[k:])
    return s2
}

a = Insert(a, i, b...)

Push

a = append(a, x)

Pop

 x, a = a[len(a)-1], a[:len(a)-1]

Push Front/Unshift

a = append([]T{x}, a...)

Pop Front/Shift

x, a = a[0], a[1:]

2423423423

7 additional slicing techniques

Filtering without allocating

The following example demonstrates that in data filtering, b operates based on the original a storage space without regenerating a new storage space.

b := a[:0]
for _, x := range a {
    if f(x) {
        b = append(b, x)
    }
}

In order for unused storage after interception to be lost by GC, it needs to be set to nil:

for i := len(b); i < len(a); i++ {
    a[i] = nil // or the zero value of T
}

Reversing

Demonstration of reverse operation:

for i := len(a)/2-1; i >= 0; i-- {
    opp := len(a)-1-i
    a[i], a[opp] = a[opp], a[i]
}

There is another way:

for left, right := 0, len(a)-1; left < right; left, right = left+1, right-1 {
    a[left], a[right] = a[right], a[left]
}

Shuffling

Shuffle algorithm. The idea of the algorithm is to randomly extract a new number from the original array into the new array.

for i := len(a) - 1; i > 0; i-- {
    j := rand.Intn(i + 1)
    a[i], a[j] = a[j], a[i]
}

go1. Built in function after 10 Shuffle

Batching with minimal allocation

When batch processing large slices, this technique can be understood as follows:

actions := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
batchSize := 3
batches := make([][]int, 0, (len(actions) + batchSize - 1) / batchSize)

for batchSize < len(actions) {
    actions, batches = actions[batchSize:], append(batches, actions[0:batchSize:batchSize])
}
batches = append(batches, actions)

// result:
// [[0 1 2] [3 4 5] [6 7 8] [9]]

In-place deduplicate (comparable)

To remove duplicates from an ordered array:

import "sort"

in := []int{3,2,1,4,3,2,1,4,1} // any item can be sorted
sort.Ints(in)
j := 0
for i := 1; i < len(in); i++ {
    if in[j] == in[i] {
        continue
    }
    j++
    // preserve the original data
    // in[i], in[j] = in[j], in[i]
    // only set what is required
    in[j] = in[i]
}
result := in[:j+1]
fmt.Println(result) // [1 2 3 4]

Move to front, or prepend if not present, in place if possible.

The following code demonstrates moving the specified element to the header:

// moveToFront moves needle to the front of haystack, in place if possible.
func moveToFront(needle string, haystack []string) []string {
    if len(haystack) != 0 && haystack[0] == needle {
        return haystack
    }
    prev := needle
    for i, elem := range haystack {
        switch {
        case i == 0:
            haystack[0] = needle
            prev = elem
        case elem == needle:
            haystack[i] = prev
            return haystack
        default:
            haystack[i] = prev
            prev = elem
        }
    }
    return append(haystack, prev)
}

haystack := []string{"a", "b", "c", "d", "e"} // [a b c d e]
haystack = moveToFront("c", haystack)         // [c a b d e]
haystack = moveToFront("f", haystack)         // [f c a b d e]

Sliding Window

The following implements sliding window output according to size:

func slidingWindow(size int, input []int) [][]int {
    // returns the input slice as the first element
    if len(input) <= size {
        return [][]int{input}
    }

    // allocate slice at the precise size we need
    r := make([][]int, 0, len(input)-size+1)

    for i, j := 0, size; j <= len(input); i, j = i+1, j+1 {
        r = append(r, input[i:j])
    }

    return r
}

func TestSlidingWindow(t *testing.T) {
    result := slidingWindow(2, []int{1, 2, 3, 4, 5})
    fmt.Println(result) // [[1 2] [2 3] [3 4] [4 5]]
}

summary

  • 1. Go slice is essentially a structure, which saves its length, the capacity of the underlying array and the pointer of the underlying array.
  • 2. There are various ways to create Go slices: variable declaration, slice literal, make creation, new creation, and interception from slices / arrays.
  • 3. The Go slice uses len() to calculate the length, cap() to calculate the capacity, and append() to add elements.
  • 4. Compared with arrays, Go slices are more flexible and have many skills. Because of their flexibility, they are prone to memory leakage, which needs attention.

Keywords: Go

Added by nrsh_ram on Tue, 01 Feb 2022 02:36:00 +0200