numpy array usage

1, Basic usage of Numpy array

1. Numpy is a Python scientific computing library used to quickly process arrays of arbitrary dimensions.

2. NumPy provides an N-dimensional array type ndarray, which describes a collection of "items" of the same type.

3. numpy.ndarray supports vectorized operations: one expression is applied to every element of the array without an explicit Python loop.

4. NumPy's core is written in C, and many of its low-level routines release the GIL. Array operations therefore run in compiled code and are not limited by the speed of the Python interpreter.
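To make the point about vectorization concrete, here is a minimal sketch that doubles every element twice: once with a Python loop and once with a single array expression. The variable names and the sample data are made up for illustration.

```python
import numpy as np

# Hypothetical sample data: the integers 0..9 as a NumPy array.
values = np.arange(10)

# Pure-Python approach: the interpreter performs one multiplication per element.
doubled_loop = [int(v) * 2 for v in values]

# Vectorized approach: one expression, evaluated over the whole array in compiled code.
doubled_vec = values * 2

print(doubled_vec)  # [ 0  2  4  6  8 10 12 14 16 18]
```

Both approaches give the same numbers. The vectorized form is shorter, and on large arrays it is usually much faster.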

2, Array in numpy:

The use of arrays in Numpy is very similar to lists in Python. The differences between them are as follows:

1. Multiple data types can be stored in a list. For example, a = [1,'a'] is allowed, while arrays can only store the same data type.

2. Arrays can be multidimensional. When all the elements are numeric, a two-dimensional array behaves like a matrix in linear algebra, and arrays can be combined with one another in arithmetic operations.
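The two differences above can be checked with a short sketch. The names mixed_list, mixed_arr, m1 and m2 are chosen here for illustration only.

```python
import numpy as np

mixed_list = [1, 'a']            # a list keeps each element's own type
mixed_arr = np.array([1, 'a'])   # NumPy converts every element to one common dtype
print(mixed_arr.dtype.kind)      # 'U': both elements are now Unicode strings

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])
print(m1 + m2)   # element-wise addition
print(m1 * m2)   # element-wise multiplication, not the matrix product
print(m1 @ m2)   # matrix product, as in linear algebra
```

Note that `*` multiplies element by element. The `@` operator is the one that gives the matrix product.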

3, Create an array (np.ndarray object):

NumPy works mainly with arrays, so the first step is to learn how to create them. The array type in NumPy is called ndarray. Here are several ways to create one:

1. Generated from lists in Python:

import numpy as np
a1 = np.array([1,2,3,4])
print(a1)
print(type(a1))

2. Generated with np.arange. The usage of np.arange is similar to range in Python:

import numpy as np
a2 = np.arange(2,21,2)
print(a2)

3. Use np.random to generate an array of random numbers:

import numpy as np
a1 = np.random.random((2,2)) # Generates a 2x2 array of random floats in [0, 1); the shape is passed as a tuple
a2 = np.random.randint(0,10,size=(3,3)) # Generates a 3x3 array of random integers from 0 to 9 (10 is excluded)

4. Use the function to generate a special array:

import numpy as np
a1 = np.zeros((2,2)) #Generate an array of 2 rows and 2 columns with all elements of 0
a2 = np.ones((3,2)) #Generate an array of 3 rows and 2 columns with all elements being 1
a3 = np.full((2,2),8) #Generate an array of 2 rows and 2 columns with all elements of 8
a4 = np.eye(3) #Generate a 3x3 matrix with 1 on the main diagonal and 0 everywhere else
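As a quick check on the creation functions above, the sketch below inspects what they return. The seeded generator at the end (np.random.default_rng) is not part of the original examples. It is shown only as one possible way to get reproducible random arrays, and the seed value 0 is arbitrary.

```python
import numpy as np

evens = np.arange(2, 21, 2)   # start=2, stop=21 (excluded), step=2
zeros = np.zeros((2, 2))
filled = np.full((2, 2), 8)
identity = np.eye(3)

print(evens)              # [ 2  4  6  8 10 12 14 16 18 20]
print(zeros.dtype)        # float64: zeros, ones and eye produce floats by default
print(filled.dtype.kind)  # 'i': full infers an integer dtype from the fill value 8
print(identity.shape)     # (3, 3)

# Assumption: a seeded Generator, used here only to make the random output repeatable.
rng = np.random.default_rng(0)
rand_ints = rng.integers(0, 10, size=(3, 3))  # integers from 0 to 9
print(rand_ints.shape)    # (3, 3)
```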

4, Common properties of ndarray:

ndarray.dtype:

Because the array can only store the same data type, you can get the data type of the elements in the array through dtype. The following are the common data types of ndarray.dtype:

data type   | description                                                                | type code
bool        | Boolean (True or False), stored in one byte                                | '?'
int8        | 8-bit integer (-128 to 127)                                                | 'i1'
int16       | 16-bit integer (-32768 to 32767)                                           | 'i2'
int32       | 32-bit integer (-2147483648 to 2147483647)                                 | 'i4'
int64       | 64-bit integer (-9223372036854775808 to 9223372036854775807)               | 'i8'
uint8       | Unsigned 8-bit integer (0 to 255)                                          | 'u1'
uint16      | Unsigned 16-bit integer (0 to 65535)                                       | 'u2'
uint32      | Unsigned 32-bit integer (0 to 2**32 - 1)                                   | 'u4'
uint64      | Unsigned 64-bit integer (0 to 2**64 - 1)                                   | 'u8'
float16     | Half-precision float: 16 bits (1 sign bit, 5 exponent bits, 10 fraction bits)   | 'f2'
float32     | Single-precision float: 32 bits (1 sign bit, 8 exponent bits, 23 fraction bits) | 'f4'
float64     | Double-precision float: 64 bits (1 sign bit, 11 exponent bits, 52 fraction bits) | 'f8'
complex64   | Complex number whose real and imaginary parts are two 32-bit floats        | 'c8'
complex128  | Complex number whose real and imaginary parts are two 64-bit floats        | 'c16'
object_     | Python object                                                              | 'O'
string_     | Byte string (named bytes_ in NumPy 2.0 and later)                          | 'S'
unicode_    | Unicode string (named str_ in NumPy 2.0 and later)                         | 'U'

We can see that NumPy has many more numeric types than Python's built-in ones, because NumPy is designed to process large amounts of data efficiently. For example, if we want to store tens of billions of numbers and all of them lie between -128 and 127 (the range of one signed byte), we can set dtype to int8, which takes one eighth of the memory that int64 would. For non-negative values up to 255, uint8 serves the same purpose. Type-related operations are as follows:

1. Default data type:

import numpy as np
a1 = np.array([1,2,3])
print(a1.dtype) 
# On Windows with NumPy versions before 2.0, the default is int32
# On 64-bit macOS or Linux (and on 64-bit Windows with NumPy 2.0 or later), the default is int64

2. Specify dtype:

import numpy as np
a1 = np.array([1,2,3],dtype=np.int64)
# Or a1 = np.array([1,2,3],dtype="i8")
print(a1.dtype)

3. Modify dtype:

import numpy as np
a1 = np.array([1,2,3])
print(a1.dtype) # On Windows with NumPy versions before 2.0, this prints int32
# Modify dtype below
a2 = a1.astype(np.int64) # astype does not modify the array itself, but returns the modified result
print(a2.dtype)
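The effect of dtype on memory, and what astype does to the values, can be seen in a small sketch. The array names and sizes are made up for illustration. nbytes is the ndarray attribute that reports the total number of bytes used by the elements.

```python
import numpy as np

# Hypothetical data: one million small values that all fit in one signed byte.
small = np.ones(1_000_000, dtype=np.int64)
compact = small.astype(np.int8)   # returns a new array; `small` is unchanged

print(small.nbytes)    # 8000000 (8 bytes per element)
print(compact.nbytes)  # 1000000 (1 byte per element)

# Converting floats to integers drops the fractional part (truncation toward zero).
prices = np.array([1.9, -1.9, 2.5])
print(prices.astype(np.int32))  # [ 1 -1  2]
```

Because astype changes how each value is stored, it is worth checking that the target type can hold the data before converting to a smaller one.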

ndarray.size:

Gets the total number of elements in the array. For example, there is a two-dimensional array:

   import numpy as np
   a1 = np.array([[1,2,3],[4,5,6]])
   print(a1.size) #6 is printed because there are a total of 6 elements

ndarray.ndim:

The dimension of the array. For example:

   a1 = np.array([1,2,3])
   print(a1.ndim) # Dimension is 1
   a2 = np.array([[1,2,3],[4,5,6]])
   print(a2.ndim) # Dimension is 2
   a3 = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
   print(a3.ndim) # Dimension is 3

ndarray.shape:

A tuple giving the length of the array along each dimension. For example, the following code:

   a1 = np.array([1,2,3])
   print(a1.shape) # Output (3,), which means a one-dimensional array with 3 elements

   a2 = np.array([[1,2,3],[4,5,6]])
   print(a2.shape) # Output (2,3), which means a two-dimensional array with 2 rows and 3 columns

   a3 = np.array([
       [
           [1,2,3],
           [4,5,6]
       ],
       [
           [7,8,9],
           [10,11,12]
       ]
   ])
   print(a3.shape) # Output (2,2,3), which means a three-dimensional array: 2 blocks, each with 2 rows and 3 columns

   a4 = np.array([[1,2,3],[4,5]], dtype=object) # Rows of different lengths need dtype=object (recent NumPy versions raise an error without it)
   print(a4.shape) # Output (2,), which means a4 is a one-dimensional array with 2 elements
   print(a4) # Output [list([1, 2, 3]) list([4, 5])], where the outer layer is an array and each element is a Python list
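The three properties size, ndim and shape are linked: ndim is the length of shape, and size is the product of the numbers in shape. A minimal sketch that checks this on the three-dimensional array from above:

```python
import math
import numpy as np

a3 = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(a3.shape)                        # (2, 2, 3)
print(a3.ndim == len(a3.shape))        # True: one shape entry per dimension
print(a3.size == math.prod(a3.shape))  # True: 2 * 2 * 3 = 12 elements
print(len(a3))                         # 2: len() reports the first dimension only
```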

In addition, we can also modify the dimension of the array through ndarray.reshape. The example code is as follows:

   a1 = np.arange(12) #Generate a one-dimensional array with 12 elements
   print(a1) 

   a2 = a1.reshape((3,4)) #It becomes a two-dimensional array with 3 rows and 4 columns
   print(a2)

   a3 = a1.reshape((2,3,2)) #Into a three-dimensional array, a total of 2 blocks, each block is 3 rows and 2 columns
   print(a3)

   a4 = a2.reshape((12,)) # Change the two-dimensional array a2 into a one-dimensional array with 12 elements
   print(a4)

   a5 = a2.flatten() # No matter how many dimensions a2 is, it will become a one-dimensional array
   print(a5)

Note that reshape does not modify the original array itself, but returns the modified result. If you want to modify the array itself directly, you can use resize instead of reshape.
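One detail is worth a small experiment. Although reshape leaves the original array's shape alone, the result usually shares the same underlying data (it is a view), while flatten always makes a copy. The sketch below, with illustrative variable names, shows this, along with -1 as a placeholder dimension and the in-place behaviour of resize.

```python
import numpy as np

a1 = np.arange(12)
a2 = a1.reshape((3, 4))   # a view here: a2 shares its data with a1
a2[0, 0] = 100
print(a1[0])              # 100: the change is visible through a1
print(a1.shape)           # (12,): the shape of a1 itself is unchanged

a5 = a2.flatten()         # flatten always returns a copy
a5[0] = -1
print(a2[0, 0])           # 100: a2 is not affected by changes to the copy

a6 = a1.reshape((2, -1))  # -1 lets NumPy work out the missing dimension (6 here)
print(a6.shape)           # (2, 6)

b = np.arange(6)
result = b.resize((2, 3)) # resize changes b itself and returns None
print(b.shape, result)    # (2, 3) None
```

If an independent array is needed after reshaping, calling copy() on the result is one way to get it.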

ndarray.itemsize:

The size of each element in the array, in bytes. For example, the following code:

   a1 = np.array([1,2,3],dtype=np.int32)
   print(a1.itemsize) # Print 4, because each byte is 8 bits, 32 bits / 8 = 4 bytes
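itemsize combines with size to give the total memory used by the elements. A brief sketch, using an arbitrary 3x4 float32 array and a few of the type codes from the dtype table:

```python
import numpy as np

a = np.zeros((3, 4), dtype=np.float32)
print(a.itemsize)           # 4 bytes per element (32 bits / 8)
print(a.size * a.itemsize)  # 48 bytes in total for 12 elements
print(a.nbytes)             # 48: nbytes reports the same total directly

# The number in most type codes is the item size in bytes.
for code in ("i1", "i8", "f2", "c16"):
    print(code, np.dtype(code).itemsize)
```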


Keywords: Data Analysis numpy

Added by ricky spires on Sun, 07 Nov 2021 23:52:29 +0200