Array
Basic concepts
- It is a data structure composed of a collection of elements of the same type, which allocates a continuous piece of memory for storage
- The storage address corresponding to the element can be calculated by using the index of the element
Basic use of c + + array
#include <iostream> #include <vector> using namespace std; int main() { int a[10] = {0}; // Initialization is required, otherwise it will be an unknown value {1, 2, 0, 0, 4196048, 0, 4195696, 0, - 8688, 32767} a[0] = 1; a[1] = 2; a[10] = 10; // c/c + + does not check the array out of bounds *(a + 10) = 11; // Create an array using new int* pA = new int[10]; pA[0] = 1; *(pA + 1) = 2; // After the memory is released, pA[0] is the initial value of 0 delete []pA; // Don't leave it blank. The local variable is released after it is finished return 0; }
C attaches great importance to the efficiency of the runtime and does not perform the array cross-border check, while C + + inherits the efficiency requirements of C and does not perform the array cross-border check If the data is out of bounds, the compiler must add additional code to the generated object code to detect whether the subscript is out of bounds when the program runs, which will reduce the running speed of the program
First of all, we should be clear that the essence of accessing an array is to access a unit in a continuous memory. As long as the memory of this unit is available, the program will not crash. The reason for the unavailability of memory is usually the memory protection mechanism of the operating system, that is, the program will crash because it accesses the memory not allocated to it. (artificial restriction)
In addition, from the perspective of assembly implementation, although the declared array has capacity display, the assembly operation is still based on the address offset without additional inspection:
7 int a[10] = {0}; 0x00000000004006f5 <+8>: mov QWORD PTR [rbp-0x30],0x0 //Array variable a is no different from ordinary variables, but an address on the stack 0x00000000004006fd <+16>: mov QWORD PTR [rbp-0x28],0x0 0x0000000000400705 <+24>: mov QWORD PTR [rbp-0x20],0x0 0x000000000040070d <+32>: mov QWORD PTR [rbp-0x18],0x0 0x0000000000400715 <+40>: mov QWORD PTR [rbp-0x10],0x0 8 9 a[0] = 1; 0x000000000040071d <+48>: mov DWORD PTR [rbp-0x30],0x1 10 a[1] = 2; 0x0000000000400724 <+55>: mov DWORD PTR [rbp-0x2c],0x2 11 12 a[10] = 10; // c/c + + does not check the array out of bounds 0x000000000040072b <+62>: mov DWORD PTR [rbp-0x8],0xa
java array basic use
public static void main(String[] args) { // Static initialization, the compiler determines the length according to the number of elements int[] a = new int[] { 0, 1, 2, 3, 4 }; System.out.println(Arrays.toString(a)); // dynamic initialization int[] b = new int[10]; b[0] = 1; b[1] = 2; // b[10] = 10; // The ArrayIndexOutOfBoundsException runtime checks whether the array is out of bounds, and it cannot be detected at compile time System.out.println(Arrays.toString(b)); }
java array out of bounds will be checked at runtime, which can be checked through jdk1 8-byte code discovery details:
Classfile /root/work/vscode_prj/java/JavaDemo.class Last modified Feb 28, 2022; size 527 bytes MD5 checksum 39d8114753bae0b7d44aa6cb28d840c7 Compiled from "JavaDemo.java" public class JavaDemo minor version: 0 major version: 52 // edition flags: ACC_PUBLIC, ACC_SUPER // public Constant pool: #1 = Methodref #6.#15 // java/lang/Object."<init>":()V { public JavaDemo(); descriptor: ()V flags: ACC_PUBLIC Code: stack=1, locals=1, args_size=1 0: aload_0 1: invokespecial #1 // Method java/lang/Object."<init>":()V 4: return LineNumberTable: line 4: 0 // The position corresponding to line 4 of the code is: 0 public static void main(java.lang.String[]); descriptor: ([Ljava/lang/String;)V flags: ACC_PUBLIC, ACC_STATIC Code: stack=4, locals=3, args_size=1 ... 34: bipush 10 // int[] b = new int[10]; Push byte 36: newarray int 38: astore_2 // Store reference into local variable 39: aload_2 // b[0] = 1; // Load reference from local variable 40: iconst_0 // Push int constant , push 0 41: iconst_1 // push 1 42: iastore // Store into int array (I = > int a = > array) will check the boundary and throw exceptions when storing 43: aload_2 // b[1] = 2; 44: iconst_1 45: iconst_2 46: iastore 47: return LineNumberTable: line 7: 0 line 8: 24 line 11: 34 // int[] b = new int[10]; line 12: 39 // b[0] = 1; line 13: 43 // b[1] = 2; line 17: 47 } SourceFile: "JavaDemo.java"
From the current version jdk1 In the bytecode of 8, there is no check on the array boundary, but you can see the operation of arraylength and iaload to obtain the length of the array from the web article??
Different versions of jvm have different implementations. The operation used in the current version is iastore
If arrayref is null, iastore throws a NullPointerException.
Otherwise, if index is not within the bounds of the array referenced by arrayref, the iastore instruction throws an ArrayIndexOutOfBoundsException.
vscode uses the JVM Bytecode Viewer plug-in in Right click the class file and select View bytecode. idea click View > > show bytecode in the menu bar, and use the jclasslib plug-in to view the bytecode more clearly
go array basic usage
1: package main 2: 3: import "fmt" 4: 5: func main() { 6: var a [10]int // var a []int is an empty array => 7: a[0] = 1 8: a[1] = 2 9: fmt.Println(a) 10: }
Check the assembly language of lines 7 and 8 through dlv:
godemo.go:7 0x10cbaf3 48c744243001000000 mov qword ptr [rsp+0x30], 0x1 godemo.go:8 0x10cbafc 48c744243802000000 mov qword ptr [rsp+0x38], 0x2
It can be seen from the assembly implementation that there is no boundary check at run time, and the array of go is checked at compile time, which can be viewed go source code
// InvalidIndex occurs when an index argument is not of integer type, // negative, or out-of-bounds. // // Example: // var s = [...]int{1,2,3} // var x = s[5] // // Example: // var s = []int{1,2,3} // var _ = s[-1] // // Example: // var s = []int{1,2,3} // var i string // var _ = s[i] InvalidIndex
Dynamic array
- In actual programming, it often happens that the required memory space depends on the actual input data and cannot be determined in advance.
- The length of static array is pre-defined. In the whole program, once the size is given, it cannot be changed. Dynamic arrays, on the other hand, can be resized as needed by the program.
C + + dynamic array
A Vector is a Sequence Container that encapsulates dynamic size arrays
Like any other type of container, it can hold various types of objects. It can be simply considered that a vector is a dynamic array that can store any type.
#include <iostream> #include <vector> using namespace std; int main() { vector<int> vi = {1,2,3,4,5,6,7,8}; cout << "size:" << vi.size() << ",cap:" << vi.capacity() << "" << endl; vi.push_back(9); cout << "size:" << vi.size() << ",cap:" << vi.capacity() << "" << endl; vector<int>::iterator it; for (it = vi.begin(); it != vi.end(); it++) { cout << *it << " "; } return 0; }
Output log:
size:8,cap:8 size:9,cap:16 # Double capacity change 1 2 3 4 5 6 7 8 9
From the perspective of stl implementation, the expansion is based on double expansion. The expansion strategy implemented by Microsoft is:_ Capacity = _Capacity + _Capacity / 2
In addition, it can be seen from the creation and assembly implementation of dynamic array that it is related to space allocator:
vector<int> vi = {1,2,3,4,5,6,7,8}; 0x0000000000400c0a <+13>: lea rax,[rbp-0x35] 0x0000000000400c0e <+17>: mov rdi,rax 0x0000000000400c11 <+20>: call 0x400e70 <std::allocator<int>::allocator()> 0x0000000000400c16 <+25>: mov r12d,0x401f40 0x0000000000400c1c <+31>: mov r13d,0x8 0x0000000000400c22 <+37>: lea rdi,[rbp-0x35] 0x0000000000400c26 <+41>: mov rcx,r12 0x0000000000400c29 <+44>: mov rbx,r13 0x0000000000400c2c <+47>: mov rax,r12 0x0000000000400c2f <+50>: mov rdx,r13 0x0000000000400c32 <+53>: mov rsi,rcx 0x0000000000400c35 <+56>: lea rax,[rbp-0x50] 0x0000000000400c39 <+60>: mov rcx,rdi 0x0000000000400c3c <+63>: mov rdi,rax 0x0000000000400c3f <+66>: call 0x400efe <std::vector<int, std::allocator<int> >::vector(std::initializer_list<int>, std::allocator<int> const&)> 0x0000000000400c44 <+71>: lea rax,[rbp-0x35] 0x0000000000400c48 <+75>: mov rdi,rax 0x0000000000400c4b <+78>: call 0x400e8a <std::allocator<int>::~allocator()>
Java dynamic array
The underlying data structure of the Java dynamic array ArrayList depends on the array:
/** * The array buffer into which the elements of the ArrayList are stored. * The capacity of the ArrayList is the length of this array buffer. Any * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA * will be expanded to DEFAULT_CAPACITY when the first element is added. */ transient Object[] elementData; // non-private to simplify nested class access
The capacity expansion strategy is int newcapacity = oldcapacity + (oldcapacity > > 1), that is, newCap = oldCap + (oldCap / 2), which is the same as the C++ Vector capacity expansion strategy
private void grow(int minCapacity) { // overflow-conscious code int oldCapacity = elementData.length; int newCapacity = oldCapacity + (oldCapacity >> 1); if (newCapacity - minCapacity < 0) newCapacity = minCapacity; if (newCapacity - MAX_ARRAY_SIZE > 0) newCapacity = hugeCapacity(minCapacity); // minCapacity is usually close to size, so this is a win: elementData = Arrays.copyOf(elementData, newCapacity); }
Example code:
import java.lang.reflect.Array; import java.util.ArrayList; // java demo public class JavaDemo { public static void main(String[] args) { ArrayList<Integer> al = new ArrayList<>(); al.add(1); al.add(2); al.add(3); System.out.println(al.size()); } }
Go dynamic array
Slicing in Go language can also be regarded as a dynamic array. With the continuous change of capacity, there will be strategies to expand and shrink the capacity
From the growslice, we can see the expansion logic. If the capacity is less than 1024, it will be expanded to twice. If the capacity is greater than 1024, the capacity will be programmed with newcap += newcap / 4
func growslice(et *_type, old slice, cap int) slice { ... newcap := old.cap doublecap := newcap + newcap if cap > doublecap { newcap = cap } else { if old.cap < 1024 { newcap = doublecap } else { // Check 0 < newcap to detect overflow // and prevent an infinite loop. for 0 < newcap && newcap < cap { newcap += newcap / 4 } // Set newcap to the requested cap when // the newcap calculation overflowed. if newcap <= 0 { newcap = cap } } } ... }
Example code:
func main() { var s []int s = append(s, 1) s = append(s, 2) s = append(s, 3) s = append(s, 4) s = append(s, 5) s = append(s, 6) s = append(s, 7) s = append(s, 8) fmt.Println(len(s), cap(s)) s = append(s, 9) fmt.Println(len(s), cap(s)) }
Printout:
8 8 9 16