Original address:
https://www.cnblogs.com/flydean/p/14720858.html
=====================================================
Introduction
As mentioned earlier, NumPy has many data types, and each data type is a dtype (numpy.dtype) object. Today, let's look at dtype objects in detail.
Definition of dtype
Let's first look at the signature of the dtype constructor:
class numpy.dtype(obj, align=False, copy=False)
It converts the object obj into a dtype object.
It takes two optional parameters:
- align: whether to add padding to the fields so that they are laid out the way a C compiler would lay out a similar struct.
- copy: whether to make a new copy of the dtype object or just return a reference to an existing one.
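The effect of align is easiest to see on a structured dtype. The following is a minimal sketch; the field names flag and value are made up for illustration, and the padded sizes depend on the platform's C alignment rules.

```python
import numpy as np

# Hypothetical two-field record: a 1-byte flag followed by a 4-byte integer.
fields = [('flag', 'u1'), ('value', 'i4')]

packed = np.dtype(fields)               # fields are packed back to back
aligned = np.dtype(fields, align=True)  # padding is added, as in a C struct

# Packed layout: 'value' starts right after the 1-byte flag.
print(packed.itemsize, packed.fields['value'][1])    # 5 1

# Aligned layout: typically 8 and 4 on common platforms (platform dependent).
print(aligned.itemsize, aligned.fields['value'][1])
print(aligned.isalignedstruct)                       # True
```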
A dtype describes the type of the data (int, float, Python object, etc.), the size of the data, the byte order of the data (little-endian or big-endian), and so on.
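These properties can be read back from any dtype object. Here is a short sketch using a big-endian 32-bit integer as an arbitrary example:

```python
import numpy as np

dt = np.dtype('>i4')  # big-endian 32-bit signed integer

print(dt.kind)       # 'i': the kind of data (signed integer)
print(dt.itemsize)   # 4: the size of one element in bytes
print(dt.name)       # 'int32'
# '>' on a little-endian machine; '=' (native) if the machine is big-endian.
print(dt.byteorder)
```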
Objects that can be converted to dtype
Many kinds of objects can be converted to a dtype. Let's go through them one by one.
dtype object
If obj is already a dtype object, it is used as is.
None
If obj is None, the default float_ (that is, float64) is used. This is why the arrays we create are of float type by default.
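A quick check of this default, as a sketch (the variable names are only for illustration):

```python
import numpy as np

default_dt = np.dtype(None)   # no type given: falls back to the default
zeros_dt = np.zeros(3).dtype  # array creation without an explicit dtype

print(default_dt, zeros_dt)   # float64 float64
```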
Array scalar type
The built-in array scalar can be converted into a related data type object.
In the previous article, we talked about array scalar types. They are the data types that can be accessed as np.<type>, for example np.int32, np.complex128, etc.
Let's look at the conversion of array scalars:
In [85]: np.dtype(np.int32)
Out[85]: dtype('int32')

In [86]: np.dtype(np.complex128)
Out[86]: dtype('complex128')
For these built-in array scalar types starting with np, please refer to my previous article "NumPy: data types".
Note that array scalars are not dtype objects, although in many cases, array scalars can be used when dtype objects need to be used.
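The difference can be checked directly. In this small sketch, np.int32 is a class (the array scalar type), while np.dtype(np.int32) is the dtype object built from it:

```python
import numpy as np

int32_dtype = np.dtype(np.int32)

print(isinstance(np.int32, np.dtype))     # False: np.int32 is a scalar type (a class)
print(isinstance(int32_dtype, np.dtype))  # True

# The scalar type is accepted wherever a dtype is expected.
arr = np.zeros(3, dtype=np.int32)
print(arr.dtype == int32_dtype)           # True
print(isinstance(arr[0], np.int32))       # True: elements are array scalars
```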
Generic types
Some generic type objects can be converted to a corresponding dtype (note that NumPy has deprecated this conversion since version 1.19):
Generic type | dtype |
---|---|
number, inexact, floating | float |
complexfloating | cfloat |
integer, signedinteger | int_ |
unsignedinteger | uint |
character | string |
generic, flexible | void |
Built-in Python types
Some Python built-in types are equivalent to array scalar types, and can also be converted to dtype:
Python type | dtype type |
---|---|
int | int_ |
bool | bool_ |
float | float_ |
complex | cfloat |
bytes | bytes_ |
str | str_ |
buffer | void |
(all others) | object_ |
Here is an example of built-in Python type conversion:
In [82]: np.dtype(float)
Out[82]: dtype('float64')

In [83]: np.dtype(int)
Out[83]: dtype('int64')

In [84]: np.dtype(object)
Out[84]: dtype('O')
Objects with a .dtype attribute
Any type object that has a dtype attribute can be converted to a dtype, as long as the value of that attribute is itself convertible to a dtype.
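As an illustration, here is a sketch with a made-up class named Point whose dtype attribute holds a structured dtype. The class and its field names are assumptions for this example, not part of NumPy.

```python
import numpy as np

class Point:
    """Hypothetical class that advertises its layout through a dtype attribute."""
    dtype = np.dtype([('x', 'i4'), ('y', 'i4')])

# np.dtype looks up the attribute and uses it directly.
dt = np.dtype(Point)
print(dt.names)  # ('x', 'y')
```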
One-character strings
Each built-in data type has a corresponding character code. We can also use these character codes for conversion:
In [134]: np.dtype('b')  # byte, native byte order
Out[134]: dtype('int8')

In [135]: np.dtype('>H')  # big-endian unsigned short
Out[135]: dtype('>u2')

In [136]: np.dtype('<f')  # little-endian single-precision float
Out[136]: dtype('float32')

In [137]: np.dtype('d')  # double-precision floating-point number
Out[137]: dtype('float64')
Array-protocol type strings
Objects that implement the NumPy array interface expose a string called typestr.
The typestr describes the type and size of the elements stored in the array.
A typestr consists of three parts. The first part is the byte order: < for little-endian, > for big-endian (| when byte order does not apply).
The second part is the basic type of the array elements:
Code | Description |
---|---|
t | Bit field (following integer gives the number of bits in the bit field). |
b | Boolean (integer type where all values are only True or False) |
i | Integer |
u | Unsigned integer |
f | Floating point |
c | Complex floating point |
m | Timedelta |
M | Datetime |
O | Object (i.e. the memory contains a pointer to PyObject) |
S | String (fixed-length sequence of char) |
U | Unicode (fixed-length sequence of Py_UNICODE) |
V | Other (void * – each item is a fixed-size chunk of memory) |
The last part is the size of each element in bytes.
dtype accepts the following type codes:
Code | Description |
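The three parts can be picked out of a real typestr. This sketch reads it from the __array_interface__ attribute of an ndarray; the splitting by index is just for illustration.

```python
import numpy as np

arr = np.arange(3, dtype='>i2')  # big-endian 16-bit integers
typestr = arr.__array_interface__['typestr']

# byte order, basic type character, size in bytes
byte_order, kind, size = typestr[0], typestr[1], int(typestr[2:])
print(typestr, byte_order, kind, size)  # >i2 > i 2

# The same string is available as dtype.str.
print(arr.dtype.str)        # >i2
print(np.dtype('u1').str)   # |u1 (byte order does not apply to single bytes)
```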
---|---|
'?' | boolean |
'b' | (signed) byte |
'B' | unsigned byte |
'i' | (signed) integer |
'u' | unsigned integer |
'f' | floating-point |
'c' | complex-floating point |
'm' | timedelta |
'M' | datetime |
'O' | (Python) objects |
'S', 'a' | zero-terminated bytes (not recommended) |
'U' | Unicode string |
'V' | raw data (void) |
Let's take a few examples:
In [137]: np.dtype('d')
Out[137]: dtype('float64')

In [138]: np.dtype('i4')
Out[138]: dtype('int32')

In [139]: np.dtype('f8')
Out[139]: dtype('float64')

In [140]: np.dtype('c16')
Out[140]: dtype('complex128')

In [141]: np.dtype('a25')
Out[141]: dtype('S25')

In [142]: np.dtype('U25')
Out[142]: dtype('<U25')
Comma separated string
Comma separated strings can be used to represent structured data types.
Such a string can also be converted to a dtype. The resulting dtype names its fields f0, f1, ..., f(N-1) in order. Let's take an example:
In [143]: np.dtype("i4, (2,3)f8, f4")
Out[143]: dtype([('f0', '<i4'), ('f1', '<f8', (2, 3)), ('f2', '<f4')])
In the above example, f0 stores a 32-bit integer, f1 stores a 2 x 3 sub-array of 64-bit floating-point numbers, and f2 stores a 32-bit floating-point number.
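To see how such a dtype behaves in practice, here is a small sketch that builds an array with it and accesses the fields by their generated names (the values written are arbitrary):

```python
import numpy as np

dt = np.dtype("i4, (2,3)f8, f4")
records = np.zeros(2, dtype=dt)  # two records, all fields zero

records['f0'] = [10, 20]                       # one int per record
records['f1'][0] = np.arange(6).reshape(2, 3)  # fill the sub-array of record 0

print(dt.names)             # ('f0', 'f1', 'f2')
print(records['f0'])        # [10 20]
print(records['f1'].shape)  # (2, 2, 3): 2 records, each a 2 x 3 sub-array
print(dt.itemsize)          # 56 = 4 + 6 * 8 + 4 bytes per record
```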
Take another example:
In [144]: np.dtype("a3, 3u8, (3,4)a10")
Out[144]: dtype([('f0', 'S3'), ('f1', '<u8', (3,)), ('f2', 'S10', (3, 4))])
Type strings
Every string key in numpy.sctypeDict can be converted to a dtype:
In [146]: np.sctypeDict.keys()
Out[146]: dict_keys(['?', 0, 'byte', 'b', 1, 'ubyte', 'B', 2, 'short', 'h', 3, 'ushort', 'H', 4, 'i', 5, 'uint', 'I', 6, 'intp', 'p', 7, 'uintp', 'P', 8, 'long', 'l', 'L', 'longlong', 'q', 9, 'ulonglong', 'Q', 10, 'half', 'e', 23, 'f', 11, 'double', 'd', 12, 'longdouble', 'g', 13, 'cfloat', 'F', 14, 'cdouble', 'D', 15, 'clongdouble', 'G', 16, 'O', 17, 'S', 18, 'unicode', 'U', 19, 'void', 'V', 20, 'M', 21, 'm', 22, 'bool8', 'Bool', 'b1', 'float16', 'Float16', 'f2', 'float32', 'Float32', 'f4', 'float64', 'Float64', 'f8', 'float128', 'Float128', 'f16', 'complex64', 'Complex32', 'c8', 'complex128', 'Complex64', 'c16', 'complex256', 'Complex128', 'c32', 'object0', 'Object0', 'bytes0', 'Bytes0', 'str0', 'Str0', 'void0', 'Void0', 'datetime64', 'Datetime64', 'M8', 'timedelta64', 'Timedelta64', 'm8', 'int64', 'uint64', 'Int64', 'UInt64', 'i8', 'u8', 'int32', 'uint32', 'Int32', 'UInt32', 'i4', 'u4', 'int16', 'uint16', 'Int16', 'UInt16', 'i2', 'u2', 'int8', 'uint8', 'Int8', 'UInt8', 'i1', 'u1', 'complex_', 'int0', 'uint0', 'single', 'csingle', 'singlecomplex', 'float_', 'intc', 'uintc', 'int_', 'longfloat', 'clongfloat', 'longcomplex', 'bool_', 'unicode_', 'object_', 'bytes_', 'str_', 'string_', 'int', 'float', 'complex', 'bool', 'object', 'str', 'bytes', 'a'])
For example:
In [147]: np.dtype('uint32')
Out[147]: dtype('uint32')

In [148]: np.dtype('float64')
Out[148]: dtype('float64')
Tuples
A tuple that contains a dtype can be used to build a new dtype.
Tuples can take several forms.
(flexible_dtype, itemsize)
For flexible (variable-length) dtypes, you can specify the item size:
In [149]: np.dtype((np.void, 10))
Out[149]: dtype('V10')

In [150]: np.dtype(('U', 10))
Out[150]: dtype('<U10')
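The second element of the tuple counts items, not always bytes. The sketch below compares the resulting itemsize for void, Unicode, and bytes types (variable names are only for illustration):

```python
import numpy as np

void10 = np.dtype((np.void, 10))  # 10 raw bytes
str10 = np.dtype(('U', 10))       # 10 Unicode characters
bytes5 = np.dtype(('S', 5))       # 5 bytes

print(void10.itemsize)            # 10
print(str10.itemsize)             # 40: each Unicode character takes 4 bytes
print(bytes5 == np.dtype('S5'))   # True: same as the string form
```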
(fixed_dtype, shape)
For fixed-length dtypes, you can specify a shape:
In [151]: np.dtype((np.int32, (2,2)))
Out[151]: dtype(('<i4', (2, 2)))

In [152]: np.dtype(('i4, (2,3)f8, f4', (2,3)))
Out[152]: dtype(([('f0', '<i4'), ('f1', '<f8', (2, 3)), ('f2', '<f4')], (2, 3)))
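A sub-array dtype like this keeps track of its base type and shape. When it is used directly as the dtype of a new array, the sub-array dimensions are appended to the array's shape, as this sketch shows:

```python
import numpy as np

dt = np.dtype((np.int32, (2, 2)))

print(dt.base, dt.shape)  # int32 (2, 2)
print(dt.itemsize)        # 16 = 4 elements * 4 bytes

# The (2, 2) sub-array shape is absorbed into the array's own shape.
arr = np.zeros(3, dtype=dt)
print(arr.shape, arr.dtype)  # (3, 2, 2) int32
```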
[(field_name, field_dtype, field_shape), ...]
Each element of the list describes one field. A field is made up of two or three parts: field name, field type, and field shape.
If field_name is '', a default name of the form f0, f1, ... is used, according to the field's position. field_name can also be a 2-tuple consisting of a title and a name.
field_dtype is the dtype of the field.
field_shape is optional. Specify it when the field is a sub-array.
In [153]: np.dtype([('big', '>i4'), ('little', '<i4')])
Out[153]: dtype([('big', '>i4'), ('little', '<i4')])
There are two fields above. One is a big-endian 32-bit int and the other is a little-endian 32-bit int.
In [154]: np.dtype([('R','u1'), ('G','u1'), ('B','u1'), ('A','u1')])
Out[154]: dtype([('R', 'u1'), ('G', 'u1'), ('B', 'u1'), ('A', 'u1')])
Four fields, each an unsigned 8-bit integer.
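Here is a sketch of how the list form is used with data. The pixel values are arbitrary, and the second dtype (mixed) is a made-up example showing an empty field name and a field with a shape.

```python
import numpy as np

rgba = np.dtype([('R', 'u1'), ('G', 'u1'), ('B', 'u1'), ('A', 'u1')])
pixels = np.array([(255, 0, 0, 255), (0, 128, 255, 64)], dtype=rgba)

print(pixels['G'])    # [  0 128]: the green channel of every pixel
print(rgba.itemsize)  # 4 bytes per pixel

# An empty name gets a default one; a third element gives the field a shape.
mixed = np.dtype([('', 'i4'), ('pos', 'f4', (3,))])
print(mixed.names)         # ('f0', 'pos')
print(mixed['pos'].shape)  # (3,)
```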
{'names': ..., 'formats': ..., 'offsets': ..., 'titles': ..., 'itemsize': ...}
With this form, you specify a list of names and a list of formats:
In [157]: np.dtype({'names': ['r','g','b','a'], 'formats': [np.uint8, np.uint8, np.uint8, np.uint8]})
Out[157]: dtype([('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')])
offsets gives the byte offset of each field, titles gives the title of each field, and itemsize is the size of the entire dtype.
In [158]: np.dtype({'names': ['r','b'], 'formats': ['u1', 'u1'],
     ...:           'offsets': [0, 2],
     ...:           'titles': ['Red pixel', 'Blue pixel']})
Out[158]: dtype({'names':['r','b'], 'formats':['u1','u1'], 'offsets':[0,2], 'titles':['Red pixel','Blue pixel'], 'itemsize':3})
(base_dtype, new_dtype)
A basic dtype can be reinterpreted as a structured dtype of the same size:
In [159]: np.dtype((np.int32,{'real':(np.int16, 0),'imag':(np.int16, 2)}))
Out[159]: dtype([('real', '<i2'), ('imag', '<i2')])
A 32-bit int is split into two 16-bit ints: the first two bytes are accessed through the field real and the last two through the field imag.
In [161]: np.dtype(('i4', [('r','u1'),('g','u1'),('b','u1'),('a','u1')]))
Out[161]: dtype([('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')])
A 32-bit int is split into four unsigned 8-bit integers.
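The fields of such a dtype overlay the bytes of the base type, so both must have the same size. The sketch below inspects the layout and then tries a deliberately mismatched pair; the flag name size_mismatch_rejected is made up for this example, and it assumes NumPy reports the mismatch as a ValueError.

```python
import numpy as np

dt = np.dtype(('i4', [('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')]))

print(dt.itemsize)                          # 4: still the size of the base int
print(dt.names)                             # ('r', 'g', 'b', 'a')
print([dt.fields[n][1] for n in dt.names])  # [0, 1, 2, 3]: one byte each

# Two 1-byte fields cannot overlay a 4-byte base type.
try:
    np.dtype(('i4', [('r', 'u1'), ('g', 'u1')]))
    size_mismatch_rejected = False
except ValueError:
    size_mismatch_rejected = True
print(size_mismatch_rejected)
```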
==============================================