[python package] NumPy - fast data processing

Contents

1, ndarray object

Disadvantages of the list:

Advantages of NumPy:

Usage:

  Multidimensional array

ndarray objects: the shape attribute

Element type

Element type cast

Creating an ndarray from a sequence

Creating an ndarray with the from* family of functions

Structured array

Masked array

Skills in using array subscripts

2, ufunc function

Arithmetic operators / comparison operators of ufunc

ufunc function speed measurement

ufunc functions: Customizing

Broadcasting

3, Subscript access of multidimensional array

Skills in using array subscripts

4, NumPy file read / write

NumPy file read / write

1, ndarray object

A list from the Python standard library can serve as an array; it supports dynamic memory allocation and garbage collection.

List elements can be any object, which makes lists very flexible.

Disadvantages of the list:

  1. Slow: every access inside a loop goes through subscript checks and type checks
  2. Uses more memory: each element is stored as a full object plus a pointer to it

Advantages of NumPy:

  1. Two key tools: the multidimensional array ndarray and the universal function ufunc
  2. Oriented toward numerical calculation and fast (the built-in functions run at close to C speed)

Usage:

NumPy officially provides extensive Chinese-language resources

  1. Import the package before use
  2. An imported package can be one installed through conda or pip, or another x.py file on Python's path.
# Import a package installed on Python's default path with conda or pip
import numpy as np # Import the package named numpy under the alias np
import numpy     # Import the package named numpy
from numpy import array as ar   # Import array from numpy under the alias ar
np = __import__("numpy") # A module can also be imported by its name given as a string, using the function form of import, __import__(str). Note the double underscores
An example of drawing with packages (the mayavi package must be installed):
# Create the data.
from numpy import pi, sin, cos, mgrid
dphi, dtheta = pi/250.0, pi/250.0
[phi,theta] = mgrid[0:pi+dphi*1.5:dphi,0:2*pi+dtheta*1.5:dtheta]
m0 = 4; m1 = 3; m2 = 2; m3 = 3; m4 = 6; m5 = 2; m6 = 6; m7 = 4;
r = sin(m0*phi)**m1 + cos(m2*phi)**m3 + sin(m4*theta)**m5 + cos(m6*theta)**m7
x = r*sin(phi)*cos(theta)
y = r*cos(phi)
z = r*sin(phi)*sin(theta)
# View it.
from mayavi import mlab
s = mlab.mesh(x, y, z)
mlab.show()

  Multidimensional array

The multidimensional array ndarray (n-dimensional array object) is the core object of NumPy

It stores a multidimensional array whose elements all have a single type. Note the difference from a list

  • Simple structure and powerful functionality
  • Fast, because it is built on an optimized C API

Creating an ndarray object with the constructor

Creating an ndarray with the array function

import numpy as np 
# Create directly with the constructor of the ndarray object
# Its parameter is the shape of the multidimensional array; the contents are uninitialized
a = np.ndarray([2,3,4])
print(type(a))
print(a.shape) # The shape attribute returns the shape of a as a tuple
print(a)
# Create an ndarray using the array function
# Pass a Python sequence object to np.array() to convert it into an ndarray object
a=np.array([1,2,3,4])
b=np.array((5,6,7,8))
c=np.array([[1,2,3,4],[4,5,6,7],[7,8,9,10]])
print(type(a))
print('a = ',a)
print('b = ',b)
print('c = ',c)

import numpy as np 
a = np.array([1.1, 2.0, 3.5])
print(a,type(a),a.dtype,a.shape)
b = np.array([1.1 + 2.5j, 2.0 + 5.1j, 3.5 + 2.7j])
print(b,type(b),b.dtype,b.shape)
c = np.array(['hello, world!','hello, numpy array!',"I'll try Chinese"])
print(c,type(c),c.dtype,c.shape)
# You can also use the zeros, ones, empty and full functions to create an array of a given shape that is filled with 0, filled with 1, left uninitialized, or filled with a specified value
zz=np.zeros((2,3,4))
oo=np.ones((2,3,4))
ee=np.empty((2,3,4))
ff=np.full((2,3,4),999)
print('zz = ', zz)
print('oo = ', oo)
print('ee = ', ee)
print('ff = ', ff)
# empty only allocates memory without assigning values, so it is the fastest, but the contents are arbitrary! An ndarray created with empty must be initialized before it is used.
# Create an array with the same shape and dtype as a
za = np.zeros_like(a)
oa = np.ones_like(a)
ea = np.empty_like(a)
fa = np.full_like(a,999) 
print('za = ', za)
print('oa = ', oa)
print('ea = ', ea)
print('fa = ', fa)

ndarray objects: the shape attribute

The shape attribute returns the shape of the array object as a tuple giving the length of each axis. The length of the tuple equals the number of dimensions of the array

  • (3,4) means that axis 0 has length 3 and axis 1 has length 4 (three rows and four columns)
  • (2,3,4) means that axis 0 has length 2, axis 1 has length 3, and axis 2 has length 4

In an ndarray object, the data is flattened to one dimension and stored in a contiguous block of memory. The shape of the ndarray only tells NumPy how to read that block
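
The flat storage described above can be inspected directly. The following is a minimal sketch; the variable names are illustrative, and the dtype is fixed to int64 so that the byte counts are the same on every platform.

```python
import numpy as np

# A freshly created 3x4 array of 8-byte integers
a = np.arange(12, dtype=np.int64).reshape(3, 4)

# strides gives the number of bytes to step in memory per axis:
# moving one row skips 4 elements * 8 bytes, moving one column skips 8 bytes
print(a.shape)    # (3, 4)
print(a.strides)  # (32, 8)

# ravel() shows the underlying one-dimensional order of the data
flat = a.ravel()
print(flat)

# A different shape reads the same flat data in the same order
b = a.reshape(4, 3)
print(b.ravel())
```

Only the shape and strides differ between `a` and `b`; the order of the elements in memory is the same.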

Change the shape attribute to change the shape of the array.

import numpy as np 
c = np.array([[[1,2,3,4],[5,6,7,8],[9,10,11,12]],\
              [[13,14,15,16],[17,18,19,20],[21,22,23,24]]])
print(c.shape)
print(c) # 2 blocks, 3 rows, 4 columns (axis 0 has length 2, axis 1 has length 3, and axis 2 has length 4)
# Change the shape of the array
c.shape = (2,4,3) # Note that this is not a transpose!!! Changing the shape leaves the order of the data in memory unchanged.
print(c.shape)
print(c)
# Use -1 to have the length of that axis calculated automatically
c.shape = 3,-1
print(c.shape)
print(c)
import numpy as np 
c = np.array([[[1,2,3,4],[5,6,7,8],[9,10,11,12]],
              [[13,14,15,16],[17,18,19,20],[21,22,23,24]]])
print(c.shape)
print(c) # 2 blocks, 3 rows, 4 columns (axis 0 has length 2, axis 1 has length 3, and axis 2 has length 4)
# Use reshape to create a new array with the specified shape
d = c.reshape((2,3,4))
print('d.shape = ', d.shape)
print('d = ', d)
# The shape of c remains unchanged
print('c.shape = ', c.shape)
# d shares its data with c, so changing an element of c also changes d (and vice versa)!
# For an independent copy, use c.copy() (copy.deepcopy(c) also works)
c[0,0] = 2233
print('c = ', c)
print('d = ', d)
d = c.reshape((3,2,4))
print('d = ', d)
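
Whether a reshaped array shares data with its source can be checked with np.shares_memory. A small sketch with illustrative names, assuming a contiguous source array (for which reshape returns a view):

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)

# reshape of a contiguous array returns a view on the same data
d = c.reshape(3, 2, 4)
# .copy() makes an independent array with its own data
e = c.reshape(3, 2, 4).copy()

c[0, 0, 0] = 2233

print(np.shares_memory(c, d))  # True: d sees the change
print(np.shares_memory(c, e))  # False: e keeps the old value
print(d[0, 0, 0], e[0, 0, 0])
```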

Element type

  • When reading and writing scientific data, the data type of an ndarray's elements is very important!
  • Like arrays in C, NumPy arrays are typed; the type can be inspected through the dtype attribute
  • You can specify the data type when creating an array
  • NumPy supports a wider range of data types than the Python standard library
import numpy as np 
c = np.array([[[1,2,3,4],[5,6,7,8],[9,10,11,12]],\
              [[13,14,15,16],[17,18,19,20],[21,22,23,24]]])
print(c.shape)
# Look at the type of ndarray c
print(c.dtype)
# What is the default data type for creating an array?
a = np.array([1,2,3,4])
print(a.dtype)
b = np.array([1.0, 2.0, 3.0, 4.0])
print(b.dtype)
c = np.zeros(4)
print(c.dtype)

Element type cast

import numpy as np 
# Specify the data type when creating an array
ai32 = np.array([1, 2, 3, 4], dtype=np.int32) 
af = np.array([1, 2, 3, 4], dtype=float)
ac = np.array([1, 2, 3, 4], dtype=complex) 
# np.int32 is a NumPy data type; float and complex are Python built-in types, which are converted to NumPy data types automatically
print(ai32.dtype)
print(af.dtype)
print(ac.dtype)
# Use set(np.sctypeDict.values()) to view the types supported by NumPy
# (np.typeDict, seen in older code, was an alias that newer NumPy releases no longer provide)
set(np.sctypeDict.values()) 
import numpy as np 
# Type conversion
# np.int16 converts a value to the int16 type, which behaves like the corresponding type in C
a=np.int16(200)
# The next line overflows. What is the result?
print(a*a)
# A single value converted to a NumPy scalar is operated on more slowly than a plain Python number, so using NumPy scalars on their own is not recommended!
# For NumPy objects, use NumPy's built-in functions on whole arrays to gain speed
v1 = 3.14
v2 = np.float64(v1)
print('Timing the Python built-in float type:')
%timeit v1*v1
print('Timing the NumPy float64 type:')
%timeit v2*v2
# Type conversion of arrays
# (the aliases np.float and np.complex no longer exist in current NumPy; use the built-in float and complex)
t1 = np.array([1, 2, 3, 4], dtype=float) 
t2 = np.array([1, 2, 3, 4], dtype=complex) 
# The astype method converts the element type and returns a new array; the original array is unchanged
t3 = t1.astype(np.int32)
t4 = t2.astype(np.complex64)
print(t1.dtype)
print(t2.dtype)
print(t3.dtype)
print(t4.dtype)
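
Two consequences of casting are easy to overlook: converting floats to integers drops the fractional part, and fixed-width integer arrays wrap around on overflow. A short sketch with illustrative values:

```python
import numpy as np

f = np.array([1.7, -1.7, 2.0])

# astype returns a new array; float -> int truncates toward zero
i = f.astype(np.int32)
print(i, i.dtype)
print(f.dtype)  # the original array is still float64

# int16 holds values from -32768 to 32767, so 200 * 200 = 40000 wraps:
# 40000 - 65536 = -25536
small = np.array([200], dtype=np.int16)
sq = small * small
print(sq, sq.dtype)
```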

Creating an ndarray from a sequence

np.arange()

  • Creates an arithmetic sequence from start, end, and step values
  • np.arange(0, 1, 0.1)
  • Note that the end value 1 is not in the array!

np.linspace()

  • Creates an arithmetic sequence from a start value, an end value and the number of elements
  • np.linspace(0, 1, 10)
  • np.linspace(0, 1, 10, endpoint=False)
  • The endpoint parameter specifies whether the end value is included. The default is True, that is, the end value is included

np.logspace()

  • Creates a geometric sequence from a start exponent, an end exponent and the number of elements
  • np.logspace(0, 2, 5)
  • np.logspace(0, 1, 12, base=2, endpoint=False)
  • The base parameter changes the base. The default is 10
  • The endpoint parameter specifies whether the end value is included. The default is True
import numpy as np 
# Create an arithmetic sequence from start, end, and step values
np.arange(0, 1, 0.1) # From 0 to 1 in steps of 0.1. Note that 1 is not in the array!
# Create an arithmetic sequence from a start value, an end value and the number of elements
# np.linspace(0, 1, 10) # An arithmetic sequence of 10 elements from 0 to 1
np.linspace(0, 1, 10, endpoint=False) # The endpoint parameter specifies whether the end value is included. The default is True, that is, the end value is included
# Create a geometric sequence from a start exponent, an end exponent and the number of elements
# np.logspace(0, 2, 5) # A geometric sequence of 5 elements from 10**0 to 10**2
np.logspace(0, 1, 12, base=2, endpoint=False) # The base parameter changes the base. The default is 10
# The endpoint parameter specifies whether the end value is included. The default is True
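
The differences between the three functions are easiest to see side by side. A small sketch using the same calls as above, with the results stored in illustrative variable names:

```python
import numpy as np

a = np.arange(0, 1, 0.1)                    # 10 values, 1 is excluded
l1 = np.linspace(0, 1, 10)                  # 10 values, 1 is included
l2 = np.linspace(0, 1, 10, endpoint=False)  # 10 values, 1 is excluded
g = np.logspace(0, 2, 5)                    # 10**0 ... 10**2, 5 values

print(a)
print(l1)
print(l2)
print(g)
```

With endpoint=False, linspace produces the same values as the arange call; logspace spaces the exponents evenly, so consecutive elements have a constant ratio.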

Creating an ndarray with the from* family of functions

The from* functions create an ndarray directly from three "stream-like" types

fromstring(): creates from a string (str)

frombuffer(): creates from a byte sequence (bytes)

fromfile(): creates from a file (file)

In addition, the fromfunction function creates an ndarray from a function

import numpy as np 
# Create ndarray from string
s = '1 2 3 4 5'
a = np.fromstring(s, dtype=float, sep=' ')
print(a)
# Change the dtype and sep parameters (any separator can be used)
s = '1, 2, 3, 4, 5'
a = np.fromstring(s, dtype=np.int8, sep=',')
print(a)
# If the string holds characters rather than numbers, the conversion is problematic
# Without sep, fromstring works in binary mode, which is deprecated: older NumPy releases warn here and newer ones may raise an error
s = 'abcdefgh'
a = np.fromstring(s, dtype=np.int8)
print(a)
# The warning says that this usage is deprecated.
# To convert a string of actual text characters to an ndarray,
# use frombuffer on its binary (bytes) form instead.
import numpy as np 
# Create an ndarray from a byte sequence
# Characters can be converted into an array according to their ASCII codes
# The b prefix before the string marks a bytes literal (byte sequence)
s = b"abcdefgh"
a = np.frombuffer(s, dtype=np.int8)
print(a)
# A bytes literal can only contain ASCII characters
s = b"Chinese string"
a = np.frombuffer(s, dtype=np.int8)
print(a)
# fromfile can read both text files and binary files. Here a text file is used as an example
# (data.txt is assumed to contain integers separated by spaces)
with open('data.txt','r') as f:
    a = np.fromfile(f)
print(a)
# Reading with fromfile's default parameters gives the wrong result. What went wrong?
with open('data.txt','r') as f:
    a = np.fromfile(f, dtype=np.int8)
print(a)
# Still wrong?
with open('data.txt','r') as f:
    a = np.fromfile(f, dtype=np.int8, sep=' ')
print(a)
# The correct dtype and separator must be set to read a text file correctly!
# fromfile is equivalent to applying fromstring or frombuffer to the contents of the file object f.
import numpy as np 
# First, define a function that calculates a value from the subscript
def func1d(i):
    return i % 4 +1
# Then use fromfunction to create an ndarray of the given shape, in which each element is calculated from its subscript
# Note that the shape must be given as a tuple. A tuple with a single element is written (x,). Writing just a number raises an error
a = np.fromfunction(func1d, (10,))
print(a)
# Generate a two-dimensional array from a function
def func2d(i,j):
    return (i+1)*(j+1)
a = np.fromfunction(func2d, (9,9))
print(a)
# Generate 3D array from function
def func3d(i,j,k):
    return (i+1)*(j+1)*(k+1)
a = np.fromfunction(func3d, (2,5,5))
print(a)
# Let's see help
help(np.fromfile)
# help(np.fromstring)
# help(np.frombuffer)
# help(np.fromfunction)
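
Two details of these functions are worth checking by hand: fromfunction calls the function once with whole subscript arrays rather than once per element, and frombuffer needs the dtype the bytes were written with. A minimal sketch; the helper names `table` and `calls` are illustrative only.

```python
import numpy as np

calls = []  # records the argument type of every call to table

def table(i, j):
    calls.append(type(i))
    return (i + 1) * (j + 1)

# dtype=int makes the subscript arrays integers instead of the default floats
t = np.fromfunction(table, (3, 3), dtype=int)
print(t)
print(calls)  # a single call, with ndarray arguments

# Round trip through raw bytes: tobytes() writes them, frombuffer() reads them
raw = np.arange(4, dtype=np.int16).tobytes()
back = np.frombuffer(raw, dtype=np.int16)
print(back)
```

Because the function receives arrays, it has to be written with operations that work element by element, as the multiplication here does.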

Structured array

  • Like the array of structs in C, NumPy provides structured arrays to make it easy to process structured data.
  • In C, the fields of a struct occupy a contiguous block of memory, and two structs of the same type have the same size, so it is easy to define an array of structs.
  • In NumPy, a structured array can be laid out the same way as in C. The array elements are structs, and each struct is made up of several fields.

import numpy as np 
# To define a structure, first create an np.dtype object, persontype
# Its argument is a dictionary that describes the fields of the structure type
# The dictionary has two keys, names and formats, and the value of each is a list. names holds the field names and formats holds the field types.
persontype = np.dtype({
    'names': ['name', 'age', 'weight'],
    'formats': ['S30', 'i', 'f']}, align=True)
# 'S30' is a byte-string type 30 bytes long. Every element of a structured array must have a fixed size, so the string type must also specify its length
# 'i' is a 32-bit integer, equivalent to np.int32
# 'f' is a 32-bit single-precision float, equivalent to np.float32
# Create a structured array with np.array(), where each structure element is given as a tuple
# Remember to set the dtype parameter to persontype
a = np.array([('Zhang',32,75.5),('Wang',24,65.2)], dtype=persontype)
# a = np.array([('张三', 32,75.5),('Wang',24,65.2)], dtype=persontype)
# The line above raises an error: an 'S30' field holds ASCII byte strings (b mode) and does not accept non-ASCII text such as a Chinese name
# Structured arrays are accessed through subscripts, just like ordinary arrays.
# The elements of a structured array look like tuples, but they are actually of the predefined structure type (persontype in this case)
print(a)
print(a[0])
print(a[0].dtype)
# You can use the field name as a subscript to obtain the corresponding field value
print(a[0]['name'])

import numpy as np 
# To define a structure, first create an np.dtype object persontype
persontype = np.dtype({
    'names': ['name', 'age', 'weight'],
    'formats': ['S30', 'i', 'f']}, align=True)
a = np.array([('Zhang',32,75.5),('Wang',24,65.2)], dtype=persontype)
# If you modify the field of a structure element, the corresponding part of the structure array will also be modified
c = a[0]
c['name'] = 'Li'
print(c)
print(a)
# You can directly obtain the fields of the structure array and return the "view" of the original array
# The field view is also an ndarray object. Changes to this view will be reflected in the structure array
b = a['age']
print(b)
print(type(b)) # The field view is actually an ndarray object
b += 5         # Calculate the ndarray
print(b)
print(a)       # The result is reflected on the original structure array
# When a field is itself an array, the third element of its tuple gives the array shape
# A dtype can also be created from a list of such tuples passed to the constructor
datatype = np.dtype([('data1','i4'),('data2','f8',(2,3)),('data3','f8',(10,10))])
print(datatype)
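
If names may contain non-ASCII text, one option is a Unicode field ('U30') in place of the byte-string field 'S30'. The sketch below assumes that substitution; `persontype_u` and `people` are illustrative names, not part of the original example. It also shows sorting and filtering by field.

```python
import numpy as np

# 'U30' is a fixed-length Unicode string of up to 30 characters
persontype_u = np.dtype([('name', 'U30'), ('age', 'i4'), ('weight', 'f4')])
people = np.array([('张三', 32, 75.5), ('Wang', 24, 65.2)], dtype=persontype_u)

# Sort the records by one field
by_age = np.sort(people, order='age')
print(by_age['name'])

# A Boolean condition on one field selects whole records
heavy = people[people['weight'] > 70]
print(heavy)
```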

Masked array

If some values in a NumPy array are missing, or some values should not take part in later operations, that part of the array can be covered up. This is called masking.

import numpy as np 
a = np.random.normal(size=(3,5))
print(a)
# Mask the elements less than 0
b = np.ma.masked_where(a < 0, a)
print(b)
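
The effect of a mask on later operations can be seen with fixed values instead of random ones. A small sketch with illustrative data:

```python
import numpy as np

a = np.array([1.0, -2.0, 3.0, -4.0])
m = np.ma.masked_where(a < 0, a)

# Masked elements are skipped: the mean is taken over 1.0 and 3.0 only
print(m.mean())
# count() gives the number of elements that are not masked
print(m.count())
# filled() returns an ordinary ndarray with masked entries replaced
print(m.filled(0.0))
```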

Skills in using array subscripts

Subscript method: a[2]

Slicing method:

  • a[3:5] includes a[3] but not a[5]
  • a[:5] starts from a[0]
  • a[:-1] a negative subscript counts from the end of the array
  • a[1:-1:2] the third number is the step; here every second element is taken
  • a[::-1] with a negative step, the order of the whole array is reversed
  • a[5:1:-2] when the step is negative, the start subscript must be greater than the end subscript

An array obtained by slicing is a "view" of the original array and shares its storage. Modifying the result array therefore changes the original array

import numpy as np 
a = np.array([1,2,3,4,5,6,7])
a[2]
a[3:5]
a[:5]
a[:-1]
a[1:-1:2]
a[::-1]
a[5:1:-2]
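
That a slice is a view can be verified by writing through it. A minimal sketch with illustrative names; `.copy()` is used where an independent array is wanted.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7])

# v is a view on elements 1, 3 and 5 of a
v = a[1:-1:2]
v[0] = 100          # also changes a[1]

# w is an independent copy of a[3:5]
w = a[3:5].copy()
w[0] = -1           # a is not affected

print(a)
print(v)
print(w)
```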

2, ufunc function

ufunc is short for universal function. It is a function that operates on each element of an array

Many of NumPy's built-in ufuncs are implemented in C, so they are very fast

  • x = np.linspace(0, 2*np.pi, 10)
  • y = np.sin(x)
  • t = np.sin(x, out=x)

NumPy array objects support operations such as addition, subtraction, multiplication and division

These operators are implemented with ufuncs in NumPy, so using an operator actually calls the corresponding ufunc

Arithmetic operators / comparison operators of ufunc

Arithmetic operators: addition, subtraction, multiplication, division, floor division, negation, power, remainder

Comparison operators: equal, not equal, less than, less than or equal, greater than, greater than or equal

Array objects support operators, which makes programming much more convenient. Note, however, that when a formula is complex and the arrays are large, the many intermediate arrays that are created slow the program down.

Consider using in-place operators such as x += y and splitting a complex formula across several lines to reduce the memory allocated for intermediate results
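
As a rough illustration of the advice above, the formula x * y + z can be evaluated in three ways that give the same result but allocate different numbers of arrays. The names here are illustrative.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 5)
y = np.full(5, 2.0)
z = np.ones(5)

# One expression: x * y creates a temporary array, then + z creates the result
r1 = x * y + z

# Split into steps with an in-place operator: only one new array is allocated
r2 = np.multiply(x, y)
r2 += z

# Reuse a preallocated buffer through the out parameter of the ufuncs
r3 = np.empty_like(x)
np.multiply(x, y, out=r3)
np.add(r3, z, out=r3)

print(r1)
```

For small arrays the difference is negligible; the saving only matters when the arrays are large.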

import numpy as np 
x1 = np.array([1,2,3,4])
x2 = np.array([5,6,7,8])
y = x1 + x2 # add
print(y)
y = x1 - x2 # subtract
print(y)
y = x1 * x2 # multiply
print(y)
y = x1 / x2 # divide
print(y)
y = x1 // x2 # floor divide
print(y)
y = -x1 # negative
print(y)
y = x1 ** x2 # power
print(y)
y = x1 % x2 # remainder
print(y)
y += x1 # in-place operator
print(y)
y = x1 == x2 # equal
print(y)
y = x1 != x2 # not equal
print(y)
y = x1 < x2 # less
print(y)
y = x1 <= x2 # less_equal
print(y)
y = x1 > x2 # greater
print(y)
y = x1 >= x2 # greater_equal
print(y)

ufunc function speed measurement

Here are four ways to calculate the sine function, compared to see which is fastest

  • math.sin in a Python for loop
  • math.sin in a Python list comprehension
  • np.sin as a ufunc applied directly to the array
  • np.sin in a Python for loop
import math
import copy
import numpy as np
# (%time is an IPython/Jupyter magic, so this block is meant to be run in a notebook)
x = [i * 0.001 for i in range(1000000)]
def sin_math_loop(x):
    for i, t in enumerate(x):
        x[i] = math.sin(t)
def sin_math_list(x):
    x = [math.sin(t) for t in x]  
def sin_numpy(x):
    np.sin(x, out = x)
    # np.sin is a ufunc, so it loops over every element of array x and calculates each sine value.
    # By default np.sin returns a newly created array holding the results, and x is left unchanged.
    # The out parameter specifies an array to hold the results instead. Setting out=x performs the calculation in place.
def sin_numpy_loop(x):
    for i, t in enumerate(x):
        x[i] = np.sin(t)     
# math.sin in a Python for loop
# Quite slow
x1 = copy.deepcopy(x)
%time sin_math_loop(x1)
# math.sin in a Python list comprehension
# A list comprehension is a little faster, but not by much
x2 = copy.deepcopy(x)
%time sin_math_list(x2)
# np.sin as a ufunc applied directly to the array
# The fastest, more than 10 times faster than the previous two, thanks to NumPy looping at the C level
x3 = np.array(x)
%time sin_numpy(x3)
# np.sin in a Python for loop
# The slowest, about 100 times slower than applying the NumPy function directly to the NumPy array. This shows why ufuncs should be applied to whole arrays.
# To support both arrays and single values, np.sin has a much more complex internal implementation than math.sin.
x4 = copy.deepcopy(x)
%time sin_numpy_loop(x4)
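
Outside a notebook, where the %time magic is not available, a similar comparison can be made with the standard timeit module. This is a rough sketch on a smaller input; the absolute numbers depend on the machine, so no particular speedup is claimed here.

```python
import math
import timeit
import numpy as np

n = 10_000
values = [i * 0.001 for i in range(n)]
arr = np.array(values)

# Both approaches compute the same sine values
loop_result = [math.sin(t) for t in values]
ufunc_result = np.sin(arr)

# Time five repetitions of each approach
t_loop = timeit.timeit(lambda: [math.sin(t) for t in values], number=5)
t_ufunc = timeit.timeit(lambda: np.sin(arr), number=5)
print(f"list comprehension: {t_loop:.5f} s, np.sin on the array: {t_ufunc:.5f} s")
```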

ufunc functions: Customizing

Use frompyfunc(func, nin, nout)

Here func is a Python function, nin is the number of input parameters of func, and nout is the number of values func returns

import numpy as np 
# Custom ufunc
def myfunc(x):
    return x**2 + 1
my_ufunc = np.frompyfunc(myfunc, 1, 1)
x = np.linspace(0,10,11)
%timeit y = my_ufunc(x)
%timeit y = myfunc(x)
y = my_ufunc(x)
print(y)
import numpy as np 
# Custom ufunc
def triangle_wave(x, c, c0, hc):
    x = x - int(x) # The triangle wave has period 1, so only the fractional part of x is used
    if x >= c: r = 0.0
    elif x < c0: r = x / c0 * hc
    else: r = (c-x) / (c-c0) * hc
    return r
x = np.linspace(0, 2, 1000)
y1 = np.array([triangle_wave(t, 0.6, 0.4, 1.0) for t in x])
triangle_ufunc1 = np.frompyfunc(triangle_wave, 4, 1)
y2 = triangle_ufunc1(x, 0.6, 0.4, 1.0)
# frompyfunc returns an array of dtype object; convert it with astype
print(y2.dtype, y2.astype(float).dtype)
triangle_ufunc2 = np.vectorize(triangle_wave, otypes=[float])
y3 = triangle_ufunc2(x, 0.6, 0.4, 1.0)
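
frompyfunc and vectorize still call the Python function once per element. As an alternative that is not in the original text, the same triangle wave can be written with array operations and np.where; `triangle_wave_np` is a hypothetical name, and the sketch assumes x is non-negative so that np.floor matches int().

```python
import numpy as np

def triangle_wave(x, c, c0, hc):
    # Scalar version from the text, kept here as the reference
    x = x - int(x)
    if x >= c: r = 0.0
    elif x < c0: r = x / c0 * hc
    else: r = (c - x) / (c - c0) * hc
    return r

def triangle_wave_np(x, c, c0, hc):
    # Array version: every branch is computed for the whole array,
    # then np.where picks the right branch for each element
    frac = x - np.floor(x)
    rising = frac / c0 * hc
    falling = (c - frac) / (c - c0) * hc
    return np.where(frac >= c, 0.0, np.where(frac < c0, rising, falling))

x = np.linspace(0, 2, 1000)
y_ref = np.array([triangle_wave(t, 0.6, 0.4, 1.0) for t in x])
y_np = triangle_wave_np(x, 0.6, 0.4, 1.0)
print(np.allclose(y_ref, y_np))
```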

Broadcasting

If the input arguments of a ufunc are arrays with different shapes, broadcasting is performed automatically

  • All input arrays are aligned with the array that has the most dimensions; shapes that are too short are padded with 1s at the front
  • The shape of the output array is the maximum along each axis of the shapes of the input arrays
  • An input array can be used in the calculation if, on every axis, its length is 1 or equal to the length of the corresponding axis of the output array; otherwise an error occurs
  • When an input array has length 1 along an axis, its single set of values on that axis is reused for every position along that axis!
import numpy as np 
a = np.arange(0, 60, 10).reshape(-1,1)
print(a)
print(a.shape)
b = np.arange(0, 5)
print(b)
print(b.shape)
# Calculate a + b to get an addition table. See how NumPy broadcasts.
c = a + b
print(c)
print(c.shape)
# By rule 1, b has one dimension fewer than a, so a 1 is prepended to its shape
# By rule 2, the length of each axis of the output array is the maximum of the input arrays' lengths on that axis
# The broadcast above is equivalent to expanding both a and b
# The expansion can be written out explicitly with repeat
a = a.repeat(5, axis = 1)
print(a)
print(a.shape)
b.shape = (1,5)
b = b.repeat(6, axis = 0)
print(b)
print(b.shape)
# This kind of broadcasting is very common. NumPy provides the ogrid object to create arrays that broadcast against each other
x, y = np.ogrid[:5,:5]
print(x)
print(y)
# NumPy also provides the mgrid object, which returns the arrays after broadcasting
x, y = np.mgrid[:5,:5]
print(x)
print(y)
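
The broadcasting rules can also be checked without doing any arithmetic. The sketch below uses np.broadcast_shapes, which is available in NumPy 1.20 and later, and np.broadcast_arrays; the variable names are illustrative.

```python
import numpy as np

a = np.arange(0, 60, 10).reshape(-1, 1)  # shape (6, 1)
b = np.arange(5)                         # shape (5,), treated as (1, 5)

# The result shape is the maximum along each axis
print(np.broadcast_shapes(a.shape, b.shape))

# broadcast_arrays returns both inputs expanded to the common shape
a2, b2 = np.broadcast_arrays(a, b)
print(a2.shape, b2.shape)

# Shapes (6, 1) and (4, 3) conflict on axis 0 (6 vs 4), so an error is raised
try:
    np.broadcast_shapes((6, 1), (4, 3))
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```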

3, Subscript access of multidimensional array

Skills in using array subscripts

  • For a multidimensional array, a tuple is used as the subscript, with one element per axis separated by commas
  • To avoid problems, write the tuple "explicitly" as the subscript
  • Integer tuples / lists / arrays and Boolean arrays as subscripts
  • Inside the subscript tuple of a multidimensional array, integer tuples or lists, integer arrays and Boolean arrays can also be used
  • If the subscript tuple contains only integers and slices, the resulting array shares data with the original array, that is, it is a view of the original array
  • If the subscript tuple also contains an integer tuple or list, an integer array or a Boolean array, the result is a copy of the original data and does not share data with the original array, so modifying the result array does not change the original array
import numpy as np 
x = np.arange(5,0,-1)
a = x[np.array([True, False, True, False, False])]
b = x[x>2]
c = x[[True, False, True, False, False]]
print('x = ', x)
print('a = ', a)
print('b = ', b)
print('c = ', c)
# An array obtained with a list or a Boolean array as the subscript does not share memory with the original array, so changes to it are not reflected in the original array
b[0] = 2233
print('b = ', b)
print('x = ', x)
print('a = ', a)
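
One point that is easy to confuse: reading with a list or Boolean subscript gives a copy, but assigning through such a subscript does write into the original array. A short sketch with illustrative names:

```python
import numpy as np

x = np.arange(5, 0, -1)      # [5 4 3 2 1]

# Reading with an integer list returns a copy
picked = x[[0, 2]]
picked[0] = 2233
unchanged = x.copy()         # snapshot: x is still [5 4 3 2 1] here

# Assigning through a Boolean subscript modifies x itself
x[x > 2] = 0

print(picked)
print(unchanged)
print(x)
```
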
import numpy as np
from mayavi import mlab
x, y, z = np.mgrid[:6,:7,:8]
c = np.zeros((6, 7, 8), dtype=int)
c.fill(1)
k = np.random.randint(2,5,size=(6, 7))
idx_i, idx_j, _ = np.ogrid[:6, :7, :8]
idx_k = k[:,:, np.newaxis] + np.arange(3)
c[idx_i, idx_j, idx_k] = np.random.randint(2,6, size=(6,7,3))
mlab.points3d(x[c>1], y[c>1], z[c>1], c[c>1], mode="cube", scale_factor=0.8, 
    scale_mode="none", transparent=True, vmin=0, vmax=8, colormap="Blues")
mlab.points3d(x[c==1], y[c==1], z[c==1], c[c==1], mode="cube", scale_factor=0.8,
    scale_mode="none", transparent=True, vmin=0, vmax=8, colormap="Vega20", opacity = 0.2)
mlab.gcf().scene.background = (1,1,1)
mlab.figure()
x, y, z = np.mgrid[:6,:7,:3]
mlab.points3d(x, y, z, c[idx_i, idx_j, idx_k], mode="cube", scale_factor=0.8, 
    scale_mode="none", transparent=True, vmin=0, vmax=8, colormap="Purples", opacity = 1)
mlab.gcf().scene.background = (1,1,1)
mlab.show()

4, NumPy file read / write

NumPy provides a set of simple functions for saving ndarray objects to a file and loading them back.

The following are introduced below:

save, savez and load

savetxt and loadtxt

tofile

import numpy as np 
# Randomly generated ndarray
a = np.random.random((5,5))
b = np.random.normal(size=(5,5))
print(a)
print(b)
# Use save to save a single ndarray object in binary form
# Note that files written by save have the extension npy
# The extension can be omitted when saving; it is then added automatically
np.save('a.npy',a)
np.save('b.npy',b)
# Use load to load the ndarray objects just saved
# When reading a file, the extension npy must be written out
af = np.load('a.npy')
bf = np.load('b.npy')
print(af)
print(bf)
import numpy as np 
# Randomly generated ndarray
a = np.random.random((5,5))
b = np.random.normal(size=(5,5))
print(a)
print(b)
# Pack multiple ndarray objects into one file with savez
# The packed file has the extension npz; it is actually a zip archive made up of several npy files (not compressed by default)
np.savez('ab.npz',a,b)
# Load the archive using load
zf = np.load('ab.npz')
# Check the names of the arrays: they were named automatically in order
print(zf.files)
# Unpack the original ndarrays from the loaded object
print(zf['arr_0'])
print(zf['arr_1'])
# With savez, keyword arguments can be used to name each ndarray object, which avoids confusion
np.savez('ab.npz',a=a,b=b)
zf = np.load('ab.npz')
print(zf.files)
print(zf['a'])
print(zf['b'])
# savetxt writes an ndarray object to a text file
# But it only supports 1D or 2D ndarrays
# The file extension can be chosen freely; savetxt does not add one automatically. txt is the usual choice.
np.savetxt('a.txt', a, fmt='%10.8f', delimiter=' ', header='a0 a1 a2 a3 a4', comments='#')
# The savetxt function has many parameters and can produce complex formatted output
# help(np.savetxt)

import numpy as np 
# Randomly generated ndarray
a = np.random.random((5,5))
b = np.random.normal(size=(5,5))
# Use loadtxt to read an ndarray object from a text file
af = np.loadtxt('a.txt')
print(af)
# The loadtxt function also has many parameters and can read data from many kinds of text files, such as csv files
# help(np.loadtxt)
# Every ndarray object has a tofile method that quickly writes it to a text file or binary file
# However, the tofile method has few options
# If the sep parameter is a non-empty string, tofile writes a text file
a.tofile('a.txt',sep=',',format='%10.8f')
# fromfile can read the file back. Take care to set the correct sep parameter
# The shape of the array is not saved, so it is lost on reading
af = np.fromfile('a.txt',sep=',')
print(af)
# Use tofile to save a binary file, which is the default mode of tofile
a.tofile('a.bin')
# fromfile can also read binary files; again, the shape of the array is lost
af = np.fromfile('a.bin')
print(af)
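
The difference between save/load and tofile/fromfile can be checked with a round trip. This sketch writes into a temporary directory so that no files are left behind; the file names are illustrative.

```python
import os
import tempfile
import numpy as np

a = np.arange(6, dtype=np.float64).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    # save/load keep the dtype and the shape
    npy_path = os.path.join(tmp, 'a.npy')
    np.save(npy_path, a)
    from_npy = np.load(npy_path)

    # tofile/fromfile store raw values only: the result is one-dimensional
    bin_path = os.path.join(tmp, 'a.bin')
    a.tofile(bin_path)
    flat = np.fromfile(bin_path, dtype=a.dtype)
    # The shape has to be restored by hand
    from_bin = flat.reshape(a.shape)

print(from_npy.shape, flat.shape, from_bin.shape)
```

When the shape and dtype need to survive, save and load are the simpler choice; tofile is useful when another program expects raw values.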

Added by happyness on Mon, 08 Nov 2021 06:35:35 +0200