This article uses Jupyter Notebook, so NumPy is imported only once, at the beginning, and not again in later examples. If you run the code in another environment, make sure NumPy has been imported first.
1 create array
import numpy as np
1.1 using array() to create vectors
vector = np.array([1, 2, 3, 4])
vector

array([1, 2, 3, 4])
1.2 numpy.array() can also be used to create matrices
matrix = np.array([[1, 'Tim'], [2, 'Joey'], [3, 'Johnny']])
print(matrix)

[['1' 'Tim']
 ['2' 'Joey']
 ['3' 'Johnny']]
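In the output above the integers are printed as strings ('1', '2', '3'). This is because a NumPy array stores a single data type, so mixed input is converted to a common one. The following is a minimal sketch for checking this through the dtype attribute; the variable names mixed and numeric are only illustrative.

```python
import numpy as np

mixed = np.array([[1, 'Tim'], [2, 'Joey']])
numeric = np.array([[1, 2], [3, 4]])

print(mixed.dtype.kind)    # 'U' stands for a Unicode string dtype
print(numeric.dtype.kind)  # 'i' stands for a signed integer dtype
print(mixed[0, 0] == '1')  # the integer 1 is now stored as the string '1'
```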
2 create Numpy array
2.1 create full 0 matrix
By default, the type created is float64
zeros = np.zeros(10)
print(zeros)
print(type(zeros))

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
<class 'numpy.ndarray'>
To force a specific type when creating a NumPy array, pass the dtype parameter
int_zeros = np.zeros(10, dtype=int)
print(int_zeros)
print(type(int_zeros))

[0 0 0 0 0 0 0 0 0 0]
<class 'numpy.ndarray'>
2.2 create multidimensional matrix
A matrix with three rows and four columns is created, and its data type is float64
np.zeros(shape=(3, 4))
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
2.3 create full 1 matrix
np.ones(shape=(3, 4))  # Here "shape=" can be omitted, so this can be written directly as np.ones((3, 4))
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
2.4 create a matrix filled with specified values
Create a matrix with three rows and five columns in which every element is 121
np.full((3, 5), 121)
array([[121, 121, 121, 121, 121],
       [121, 121, 121, 121, 121],
       [121, 121, 121, 121, 121]])
2.5 generate a vector within the specified range
The np.arange method accepts three parameters and works like Python's built-in range. Like range, the interval is closed at the front and open at the back. The first parameter is the first value of the vector. The second parameter is the end of the range, which is itself excluded, so in the example below the last value is the end minus the step size. The third parameter is the step size, which is 1 by default.
np.arange(0, 20, 2)
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
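The default values and the open end of the range are easiest to see on small inputs. The following is a short sketch; the variable names are only illustrative.

```python
import numpy as np

default_step = np.arange(3, 7)   # step defaults to 1, and 7 is excluded
only_stop = np.arange(5)         # with one argument, start defaults to 0
uneven = np.arange(0, 10, 3)     # 10 is excluded, so the last value is 9

print(default_step)
print(only_stop)
print(uneven)
```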
2.6 divide the values of the specified range equally to generate a vector
The np.linspace method divides the specified range into evenly spaced values. Unlike np.arange, the range is closed at both ends, and the third parameter is the number of values to generate rather than the step size
np.linspace(0, 10, 5)
array([ 0. , 2.5, 5. , 7.5, 10. ])
2.7 generating random number matrix
- Generate a vector with a length of 10, in which each value is a random integer from 0 up to, but not including, 10. Note that the range here is closed at the front and open at the back
np.random.randint(0, 10, 10)
array([8, 8, 1, 8, 2, 4, 7, 3, 9, 2])
- If you are not sure what each parameter represents, you can pass the length by name with the size parameter
np.random.randint(0, 5, size=5)
array([1, 2, 2, 0, 4])
- You can also generate a matrix of random integers with three rows and five columns
np.random.randint(4, 9, size=(3, 5))
array([[5, 7, 7, 4, 4],
       [4, 6, 7, 8, 4],
       [5, 7, 6, 7, 7]])
- You can also generate vectors or matrices of floating-point numbers from 0 (inclusive) to 1 (exclusive) without specifying the range
np.random.random(10)
array([0.06534131, 0.37071446, 0.02235879, 0.52019336, 0.21088465, 0.27516892, 0.07299309, 0.90930363, 0.38513079, 0.45422644])
np.random.random((2, 4))
array([[0.58483127, 0.38876978, 0.39987466, 0.75698103],
       [0.91688914, 0.49879013, 0.28552401, 0.61492315]])
- np.random.normal() draws samples from a normal (Gaussian) distribution. The parameters of numpy.random.normal(loc=0, scale=1, size=shape) are as follows:
- Parameter loc (float): the mean of the normal distribution, corresponding to the center of the distribution. loc=0 means the distribution is symmetric about the Y axis
- Parameter scale (float): the standard deviation of the normal distribution, corresponding to the width of the distribution. The larger the scale, the wider and flatter the curve, and the smaller the scale, the taller and thinner the curve
- Parameter size (int or tuple of ints): the shape of the output. The default value is None, in which case a single value is returned
np.random.normal(loc=0, scale=1, size=5)
array([ 0.28137909, 0.44488236, -0.29643414, 1.06214656, -0.33401709])
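To see what loc and scale mean in practice, one option is to draw a large sample and compare its mean and standard deviation with the requested values. This is only a sketch: the seed value 0 and the values loc=5, scale=2 are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable; the value 0 is arbitrary
samples = np.random.normal(loc=5, scale=2, size=100_000)

print(samples.shape)   # one-dimensional, with 100000 values
print(samples.mean())  # should be close to loc
print(samples.std())   # should be close to scale
```

With a sample this large, the printed mean and standard deviation should land close to 5 and 2, although the exact digits depend on the seed.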
3 get Numpy attribute
Use shape to get the size of a NumPy array along each dimension
Use ndim to get the number of dimensions of a NumPy array
a = np.arange(15)
print(a.shape)
print(a.ndim)
a = a.reshape(3, 5)
print(a.shape)
print(a.ndim)

(15,)
1
(3, 5)
2
Special usage of reshape method
If we only care about the number of rows or the number of columns and want NumPy to calculate the other dimension, we can pass -1 for that dimension
a.reshape(15, -1)
array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12],
       [13],
       [14]])
a.reshape(-1, 15)
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14]])
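The -1 placeholder works for any dimension that divides the total number of elements evenly. The sketch below shows two more inferred shapes, plus what happens when the size does not divide evenly; the variable names are only illustrative.

```python
import numpy as np

a = np.arange(15)
three_rows = a.reshape(3, -1)   # NumPy infers 5 columns
five_rows = a.reshape(5, -1)    # NumPy infers 3 columns
print(three_rows.shape, five_rows.shape)

try:
    a.reshape(4, -1)            # 15 elements cannot be split into 4 equal rows
    failed = False
except ValueError:
    failed = True
print(failed)
```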
4 Numpy array index
NumPy supports indexing similar to a Python list. As with lists, array indexes in NumPy start from 0
matrix = np.array([[1, 2, 3], [20, 30, 40]])
print(matrix)
print(matrix[0][1])

[[ 1  2  3]
 [20 30 40]]
2
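Besides chaining two brackets, a NumPy array also accepts the row and column in a single pair of brackets. The sketch below compares the two forms on the same matrix; the variable names are only illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [20, 30, 40]])

chained = matrix[0][1]   # index the row first, then the column
direct = matrix[0, 1]    # row and column in a single index
last = matrix[-1, -1]    # negative indexes count from the end, as with lists

print(chained, direct, last)
```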
5 slice
NumPy supports slicing operations similar to a Python list. In the code below:
- print(matrix[:,1]) selects the column with index 1 from all rows
- print(matrix[:,0:2]) selects the columns with indexes 0 and 1 from all rows
- print(matrix[1:3,:]) selects all columns from the rows with indexes 1 and 2
- print(matrix[1:3,0:2]) selects the data in the rows with indexes 1 and 2 and the columns with indexes 0 and 1
matrix = np.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])
print(matrix[:, 1])
print(matrix[:, 0:2])
print(matrix[1:3, :])
print(matrix[1:3, 0:2])

[10 25 40]
[[ 5 10]
 [20 25]
 [35 40]]
[[20 25 30]
 [35 40 45]]
[[20 25]
 [35 40]]
6 matrix operation in numpy
The basic matrix operations (addition, subtraction, multiplication and division) in NumPy are calculated element by element, so in the basic case the two matrices must have the same number of rows and columns.
6.1 addition and subtraction of matrix
myones = np.ones((3, 3))
myeye = np.eye(3)
print(myones)
temp = myones + myeye
print('plus:')
print(temp)
temp = myones - myeye
print('minus:')
print(temp)

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
plus:
[[2. 1. 1.]
 [1. 2. 1.]
 [1. 1. 2.]]
minus:
[[0. 1. 1.]
 [1. 0. 1.]
 [1. 1. 0.]]
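The * and / operators follow the same element-by-element rule as + and -. The following is a small sketch on two 2x2 matrices; the values are chosen only so the results are easy to check by hand.

```python
import numpy as np

a = np.array([[2.0, 4.0], [6.0, 8.0]])
b = np.array([[1.0, 2.0], [3.0, 4.0]])

product = a * b    # element-wise: [[2*1, 4*2], [6*3, 8*4]]
quotient = a / b   # element-wise: every entry is 2.0

print(product)
print(quotient)
```

Note that * here is not the matrix product.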
6.2 matrix multiplication (dot)
True matrix multiplication requires that the number of columns of the first matrix equals the number of rows of the second matrix. The function for matrix multiplication is dot
matrix_1 = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
matrix_2 = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
print(matrix_1.shape[1] == matrix_2.shape[0])  # columns of the first matrix == rows of the second matrix
print(matrix_1.dot(matrix_2))

True
[[22 28]
 [49 64]]
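Two related points can be sketched here: the @ operator computes the same matrix product as dot, and NumPy raises a ValueError when the inner dimensions do not match. The variable names are only illustrative.

```python
import numpy as np

matrix_1 = np.array([[1, 2, 3], [4, 5, 6]])     # shape (2, 3)
matrix_2 = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)

via_dot = matrix_1.dot(matrix_2)
via_operator = matrix_1 @ matrix_2   # the @ operator performs the same matrix product
print(np.array_equal(via_dot, via_operator))

try:
    matrix_1.dot(matrix_1)           # (2, 3) times (2, 3): inner sizes 3 and 2 differ
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```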
6.3 transpose of matrix
Transpose of matrix refers to changing the rows in the matrix into columns
a = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(a.shape)
print(a.T.shape)

(2, 3)
(3, 2)
6.4 inverse of matrix
The numpy.linalg.inv function is required. A matrix can be inverted only if it is square (the number of rows equals the number of columns) and has full rank
The matrix product of the original matrix and its inverse is the identity matrix
A = np.array([
    [0, 1],
    [2, 3]
])
invA = np.linalg.inv(A)
print(A)
print(invA)
print(A.dot(invA))

[[0 1]
 [2 3]]
[[-1.5  0.5]
 [ 1.   0. ]]
[[1. 0.]
 [0. 1.]]
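The sketch below checks the identity property with a tolerance and shows what happens when the full-rank condition fails. The matrix named singular is an illustrative example whose second row is twice the first.

```python
import numpy as np

A = np.array([[0, 1], [2, 3]])
invA = np.linalg.inv(A)
# Compare with a tolerance, because floating-point results may not be exactly 0 or 1
is_identity = np.allclose(A.dot(invA), np.eye(2))
print(is_identity)

singular = np.array([[1, 2], [2, 4]])   # second row is twice the first, so the rank is 1
try:
    np.linalg.inv(singular)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False
print(inverted)
```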
6.5 other preset functions
In addition to the functions mentioned above, NumPy has many other built-in functions that are applied to each element of a matrix
Matrix function | Explanation |
---|---|
np.sin(a) | Take the sine, sin(x), of each element in matrix a |
np.cos(a) | Take the cosine, cos(x), of each element in matrix a |
np.tan(a) | Take the tangent, tan(x), of each element in matrix a |
np.sqrt(a) | Take the square root, sqrt(x), of each element in matrix a |
np.abs(a) | Take the absolute value of each element in matrix a |
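A few of the functions in the table can be tried on a small matrix. The following is a minimal sketch; the input values are chosen only so the results are easy to verify by hand.

```python
import numpy as np

a = np.array([[-4.0, 0.0], [9.0, 16.0]])

absolute = np.abs(a)                 # every element becomes non-negative
roots = np.sqrt(absolute)            # sqrt is applied after abs to avoid negative inputs
cosines = np.cos(np.zeros((2, 2)))   # cos(0) is 1 for every element

print(absolute)
print(roots)
print(cosines)
```

Each function returns a new array with the same shape as its input.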
7 data type conversion
The data type of a NumPy ndarray can be set through the dtype parameter. You can also use the astype method to convert types, which is useful when processing files.
Note that astype returns a new array, that is, a copy of the original data
For example, the following code converts strings to floats
In this example, if a string contains non-numeric characters, the conversion from string to float raises an error
vector = np.array(["1", "2", "3"])
print(vector)
vector = vector.astype(float)
print(vector)

['1' '2' '3']
[1. 2. 3.]
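Both notes above can be checked directly: astype leaves the original array unchanged, and a non-numeric string makes the conversion fail. In this sketch the failing input "two" and the variable names are only illustrative.

```python
import numpy as np

original = np.array(["1", "2", "3"])
converted = original.astype(float)
# The original array keeps its string dtype ('U'); only the copy is float ('f')
print(original.dtype.kind, converted.dtype.kind)

try:
    np.array(["1", "two", "3"]).astype(float)   # "two" cannot be parsed as a number
    conversion_failed = False
except ValueError:
    conversion_failed = True
print(conversion_failed)
```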
8 Numpy's statistical calculation method
NumPy has many built-in calculation methods. The most important ones are described below
- sum(): calculate the sum of matrix elements; when rows or columns are specified with axis, the result for a matrix is a one-dimensional array
- mean(): calculate the average value of matrix elements; when rows or columns are specified with axis, the result for a matrix is a one-dimensional array
- max(): calculate the maximum value of matrix elements; when rows or columns are specified with axis, the result for a matrix is a one-dimensional array
- median(): calculate the median of matrix elements; this one is called as the function np.median(matrix) rather than as a method of the array
Note that the data used with these statistical methods must be numeric, such as int or float
matrix = np.array([
    [5, 10, 15],
    [20, 10, 30],
    [35, 40, 45]
])
print(matrix.sum())        # Calculate the sum of all elements
print(matrix[0].sum())     # Calculate the sum of the elements in the first row
print(matrix.sum(axis=1))  # Calculate the sum of each row; the result has one entry per row
print(matrix.sum(axis=0))  # Calculate the sum of each column; the result has one entry per column

210
30
[ 30  60 120]
[60 60 90]
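The example above only uses sum(). The other statistics follow the same axis convention, as this sketch on the same matrix suggests; the variable names are only illustrative.

```python
import numpy as np

matrix = np.array([[5, 10, 15], [20, 10, 30], [35, 40, 45]])

row_means = matrix.mean(axis=1)      # one mean per row
column_max = matrix.max(axis=0)      # one maximum per column
overall_median = np.median(matrix)   # median is a NumPy function, not an array method

print(row_means)
print(column_max)
print(overall_median)
```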
9 arg operation in numpy
9.1 get the subscript of the maximum value in the array
The argmax function is used to find the subscript of the maximum value in an array; In short, it is the index position corresponding to the maximum number
maxIndex = np.argmax([1, 2, 4, 5, 2])
print('maxIndex = ', maxIndex)

maxIndex =  3
9.2 get the subscript of the smallest value in the array
minIndex = np.argmin([1, 3, 2, 10, 5])
print('minIndex = ', minIndex)

minIndex =  0
9.3 shuffle an array and get the sorting indexes
x = np.arange(15, 30)
print(x)
np.random.shuffle(x)
print(x)
# Return the indexes that would sort x from small to large. Note that x itself is not
# modified; sortXIdx[i] is the index in x of the element that belongs at position i
# after sorting. Compare the outputs to see this
sortXIdx = np.argsort(x)
print(x)
print(sortXIdx)

[15 16 17 18 19 20 21 22 23 24 25 26 27 28 29]
[20 29 24 22 16 19 21 23 27 18 26 15 28 17 25]
[20 29 24 22 16 19 21 23 27 18 26 15 28 17 25]
[11  4 13  9  5  0  6  3  7  2 14 10  8 12  1]
The first element of the argsort result, 11, is the index of 15 in the shuffled array x; that is, x[11] is the smallest element
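One way to confirm what argsort returns is to index the array with the result, which should produce the values in ascending order. This sketch uses a short fixed array instead of a shuffled one so the result is repeatable; indexing an array with a list of indexes is covered in the next section.

```python
import numpy as np

x = np.array([20, 29, 24, 22, 16])
sort_idx = np.argsort(x)   # indexes of x in ascending order of value
sorted_x = x[sort_idx]     # indexing with them yields the sorted values

print(sort_idx)
print(sorted_x)
print(x)                   # x itself is unchanged
```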
10 FancyIndexing
It is easy to index a single value in a vector, which can be taken with x[n].
However, what if you want to retrieve several values at once, for example the elements at indexes 3, 5 and 8? The example code is as follows:
x = np.arange(15)
ind = [3, 5, 8]
print(x)
print(x[ind])

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
[3 5 8]
You can also construct a new two-dimensional matrix from a one-dimensional vector
The third line of the following code may be hard to understand. It constructs a new two-dimensional matrix: the first row takes the elements at indexes 0 and 2 of the x vector, and the second row takes the elements at indexes 1 and 3.
It can be understood by comparing the output
x = np.arange(15)
np.random.shuffle(x)
ind = np.array([[0, 2], [1, 3]])
print(x)
print(x[ind])

[14  7  9 11 10  4  0  1 13  5  3  6  8  2 12]
[[14  9]
 [ 7 11]]
Fancy indexing also makes it easy to retrieve data from a two-dimensional matrix. The example code is as follows:
x = np.arange(16)
np.random.shuffle(x)
x = x.reshape(4, -1)
print(x)
row = np.array([0, 1, 2])
col = np.array([1, 2, 3])
print(x[row, col])  # Equivalent to taking the three points (0,1), (1,2), (2,3)
print(x[1:3, col])  # Takes columns 1, 2 and 3 of the second and third rows (row indexes 1 and 2)

[[ 3  4  8 13]
 [15  5 11 12]
 [ 7  2  1  0]
 [ 9 10 14  6]]
[ 4 11  0]
[[ 5 11 12]
 [ 2  1  0]]
11 Numpy array comparison
NumPy has powerful support for comparing arrays or matrices. A comparison produces boolean values. In the example code below, each value in the matrix is compared with 25, which gives one boolean value per element
matrix = np.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])
m = (matrix == 25)
print(m)

[[False False False]
 [False  True False]
 [False False False]]
Let's take a more complex example:
In the following code, the output of print(second_column_25) is [False True False]. First, matrix[:, 1] selects the column with index 1 from all rows, i.e. [10 25 40]. Comparing it with 25 gives that output.
print(matrix[second_column_25, :]) returns the rows for which the value is True, that is, [20 25 30]
matrix = np.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])
second_column_25 = (matrix[:, 1] == 25)
print(second_column_25)
print(matrix[second_column_25, :])

[False  True False]
[[20 25 30]]
If more than one condition is required, you can combine the conditions with logical operators, where & represents 'and' and | represents 'or'
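The following is a minimal sketch of combining two comparisons on a vector. The values and variable names are only illustrative.

```python
import numpy as np

vector = np.array([5, 10, 15, 20])

# Each comparison needs its own parentheses, because & and | bind more tightly than ==
equal_to_ten_and_five = (vector == 10) & (vector == 5)   # no element equals both
equal_to_ten_or_five = (vector == 10) | (vector == 5)

print(equal_to_ten_and_five)
print(equal_to_ten_or_five)
print(vector[equal_to_ten_or_five])   # the boolean result can be used to select elements
```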