NumPy: data type object dtype

Original address:

https://www.cnblogs.com/flydean/p/14720858.html

=====================================================

Introduction

As mentioned earlier, NumPy has many data types, and each of them is represented by a dtype (numpy.dtype) object. This article looks at dtype objects in detail.

Definition of dtype

Let's first look at the signature of the dtype constructor:

class numpy.dtype(obj, align=False, copy=False)

It converts the object obj into a dtype object.

It takes two optional parameters:

align - whether to add padding to the fields so that they are laid out the way a C compiler would lay out a similar struct.
copy - whether to make a new copy of the dtype object; if False, the result may just be a reference to a built-in dtype object.
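To make the effect of align more concrete, here is a minimal sketch. The field names flag and value are made up for illustration, and the exact amount of padding depends on the platform, so no specific aligned size is assumed.

```python
import numpy as np

# Hypothetical two-field layout: one byte followed by a 32-bit integer.
fields = [('flag', 'u1'), ('value', 'i4')]

packed = np.dtype(fields)                # align=False: fields are packed back to back
aligned = np.dtype(fields, align=True)   # padding is added, as a C compiler would do

# Item size and byte offset of 'value' in each layout.
print(packed.itemsize, packed.fields['value'][1])
print(aligned.itemsize, aligned.fields['value'][1])
```

In the packed layout the second field starts right after the first byte, while the aligned layout usually moves it to an offset that is a multiple of its alignment.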

A dtype describes the type of the data (int, float, Python object, etc.), the size of the data, its byte order (little-endian or big-endian), and so on.
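As a small illustration of this, the following sketch inspects a few attributes of a dtype object. The choice of '>i4' is arbitrary.

```python
import numpy as np

dt = np.dtype('>i4')   # big-endian 32-bit signed integer

print(dt.name)       # type name, without the byte order
print(dt.kind)       # one-character code for the kind of data
print(dt.itemsize)   # size of one element in bytes
print(dt.str)        # type string that includes the byte order
```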

Objects that can be converted to dtype

Many kinds of obj can be converted to a dtype. We go through them one by one.

dtype object

If obj is already a dtype object, it is used as is.

None

If obj is None, the default float_ (float64) is used. This is why many array-creation functions produce floating-point arrays when no dtype is given.
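The floating-point default can be observed with a short sketch like the one below. It only uses np.zeros, np.ones and np.array. Note that np.array infers the type from its input instead of falling back to the default.

```python
import numpy as np

a = np.zeros(3)            # no dtype given
b = np.ones((2, 2))        # no dtype given
c = np.array([1, 2, 3])    # dtype inferred from the Python ints

print(a.dtype, b.dtype)    # both use the floating-point default
print(c.dtype.kind)        # an integer kind, because the type was inferred
```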

Array scalar type

A built-in array scalar type can be converted into the corresponding dtype object.

The previous article explained what array scalar types are: they are the types that can be accessed as np.<type>, for example np.int32 and np.complex128.

Let's look at the conversion of array scalars:

In [85]: np.dtype(np.int32)
Out[85]: dtype('int32')

In [86]: np.dtype(np.complex128)
Out[86]: dtype('complex128')

For the built-in array scalar types in the np namespace, please refer to my previous article "NumPy: data types".

Note that array scalar types are not dtype objects, although in many cases they can be used wherever a dtype object is expected.
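The difference between an array scalar type and a dtype object can be checked directly. This is only an illustrative sketch; the variable names are arbitrary.

```python
import numpy as np

scalar_type = np.int32           # an array scalar type (a Python class)
dt = np.dtype(scalar_type)       # the corresponding dtype object

print(isinstance(scalar_type, np.dtype))   # the class itself is not a dtype
print(isinstance(dt, np.dtype))            # the converted object is

# The scalar type is accepted where a dtype is expected.
arr = np.array([1, 2, 3], dtype=np.int32)
print(arr.dtype == dt)
```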

Generic types

Some generic type objects can be converted into a corresponding dtype. Note that this conversion of generic types has been deprecated since NumPy 1.19:

Generic type object	dtype type
number, inexact, floating	float
complexfloating	cfloat
integer, signedinteger	int_
unsignedinteger	uint
character	string
generic, flexible	void

Built-in Python types

Some Python built-in types are equivalent to array scalar types, and can also be converted to dtype:

Python type	dtype type
int	int_
bool	bool_
float	float_
complex	cfloat
bytes	bytes_
str	str_
buffer	void
(all others)	object_

Here is an example of built-in Python type conversion:

In [82]: np.dtype(float)
Out[82]: dtype('float64')

In [83]: np.dtype(int)
Out[83]: dtype('int64')

In [84]:  np.dtype(object)
Out[84]: dtype('O')
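The mapping can also be checked in a loop, as in the sketch below. The width of the integer type obtained from int depends on the platform and the NumPy version, so only its kind is relied on here.

```python
import numpy as np

# Convert a few built-in Python types and print the resulting dtypes.
for py_type in (bool, int, float, complex, object):
    print(py_type.__name__, '->', np.dtype(py_type))

float_dt = np.dtype(float)
complex_dt = np.dtype(complex)
int_dt = np.dtype(int)       # width is platform dependent
```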

Objects with a .dtype attribute

Any type object can be converted to a dtype as long as it has a dtype attribute and that attribute is itself convertible to a dtype.
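A minimal sketch of this rule follows. The class Pixel is hypothetical and exists only to carry a dtype attribute; the attribute is a real dtype instance, which is the safest choice.

```python
import numpy as np

class Pixel:
    """Hypothetical class whose dtype attribute describes its layout."""
    dtype = np.dtype([('r', 'u1'), ('g', 'u1'), ('b', 'u1')])

# np.dtype looks up the dtype attribute of the class and uses it.
dt = np.dtype(Pixel)
print(dt)
```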

One-character strings

For each built-in data type, there is a corresponding character code. We can also use these character codes for conversion:

In [134]: np.dtype('b')  # byte, native byte order
Out[134]: dtype('int8')

In [135]: np.dtype('>H')  # big-endian unsigned short
Out[135]: dtype('>u2')

In [136]: np.dtype('<f') # little-endian single-precision float
Out[136]: dtype('float32')

In [137]: np.dtype('d') # double-precision floating-point number
Out[137]: dtype('float64')
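The character codes used above can be inspected together in a short loop. This is just a sketch that prints the name and size behind each code.

```python
import numpy as np

# Map each one-character code to the name and byte size of its dtype.
for code in ('b', 'H', 'f', 'd'):
    dt = np.dtype(code)
    print(code, dt.name, dt.itemsize)

big_ushort = np.dtype('>H')   # byte-order prefix followed by the character code
print(big_ushort.str)
```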

Array-protocol type strings

Array objects in NumPy expose a property called typestr through the array interface.

typestr describes the type and size of the elements stored in the array.

typestr consists of three parts. The first part is the byte order of the data: < for little-endian, > for big-endian.

The second part is a character code for the basic type of the elements in the array:

Code	Description
t	Bit field (following integer gives the number of bits in the bit field).
b	Boolean (integer type where all values are only True or False)
i	Integer
u	Unsigned integer
f	Floating point
c	Complex floating point
m	Timedelta
M	Datetime
O	Object (i.e. the memory contains a pointer to PyObject)
S	String (fixed-length sequence of char)
U	Unicode (fixed-length sequence of Py_UNICODE)
V	Other (void * – each item is a fixed-size chunk of memory)

The last part is the size of the data: the number of bytes per element (for U, the number of characters).

dtype accepts the following type codes in such strings:

Code	Description
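The three parts can be pulled apart with a small helper. split_typestr is a hypothetical function written for this illustration, not part of NumPy. The typestr itself is read from the array's __array_interface__ dictionary.

```python
import numpy as np

def split_typestr(typestr):
    """Split a type string into (byte order, type code, size)."""
    return typestr[0], typestr[1], int(typestr[2:])

arr = np.arange(3, dtype='>i2')                 # big-endian 16-bit integers
typestr = arr.__array_interface__['typestr']    # type string of the array
parts = split_typestr(typestr)

print(typestr, parts)
print(split_typestr(np.dtype('<f8').str))       # dtype.str uses the same format
```

For single-byte types NumPy reports | in the first position, since byte order is not relevant for them.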
'?'	boolean
'b'	(signed) byte
'B'	unsigned byte
'i'	(signed) integer
'u'	unsigned integer
'f'	floating-point
'c'	complex-floating point
'm'	timedelta
'M'	datetime
'O'	(Python) objects
'S', 'a'	zero-terminated bytes (not recommended)
'U'	Unicode string
'V'	raw data (void)

Let's take a few examples:

In [137]: np.dtype('d')
Out[137]: dtype('float64')

In [138]: np.dtype('i4')
Out[138]: dtype('int32')

In [139]: np.dtype('f8')
Out[139]: dtype('float64')

In [140]:  np.dtype('c16')
Out[140]: dtype('complex128')

In [141]: np.dtype('a25')
Out[141]: dtype('S25')

In [142]: np.dtype('U25')
Out[142]: dtype('<U25')
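The number after the type code is easy to compare with the real element size, as in this sketch. For most codes it is the size in bytes, while for U it counts characters, each of which NumPy stores in 4 bytes.

```python
import numpy as np

# Print the kind and the size in bytes behind each type string.
for code in ('i4', 'f8', 'c16', 'S25', 'U25'):
    dt = np.dtype(code)
    print(code, dt.kind, dt.itemsize)

sizes = {code: np.dtype(code).itemsize for code in ('i4', 'f8', 'c16', 'S25', 'U25')}
```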

Comma-separated strings

Comma-separated strings can be used to represent structured data types.

Such a string can also be converted into a dtype. The fields of the resulting dtype are named f0, f1, ..., f(N-1), in order. Let's take an example:

In [143]: np.dtype("i4, (2,3)f8, f4")
Out[143]: dtype([('f0', '<i4'), ('f1', '<f8', (2, 3)), ('f2', '<f4')])

In the above example, f0 is a 32-bit integer, f1 is a 2 x 3 sub-array of 64-bit floating-point numbers, and f2 is a 32-bit floating-point number.

Take another example:

In [144]: np.dtype("a3, 3u8, (3,4)a10")
Out[144]: dtype([('f0', 'S3'), ('f1', '<u8', (3,)), ('f2', 'S10', (3, 4))])
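To show how such a dtype is used in practice, here is a minimal sketch that creates a small array with the first dtype above and accesses its automatically named fields. The values written into the array are arbitrary.

```python
import numpy as np

dt = np.dtype("i4, (2,3)f8, f4")
rec = np.zeros(2, dtype=dt)       # two records, all fields zero-initialized

rec['f0'] = [10, 20]                          # fill the integer field
rec['f1'][0] = np.arange(6).reshape(2, 3)     # fill the 2 x 3 sub-array of record 0

print(dt.names)           # automatically generated field names
print(rec['f1'].shape)    # record axis followed by the sub-array shape
print(dt.itemsize)        # 4 + 6 * 8 + 4 bytes
```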

Type strings

Any string in numpy.sctypeDict.keys() can be converted to a dtype:

In [146]: np.sctypeDict.keys()
Out[146]: dict_keys(['?', 0, 'byte', 'b', 1, 'ubyte', 'B', 2, 'short', 'h', 3, 'ushort', 'H', 4, 'i', 5, 'uint', 'I', 6, 'intp', 'p', 7, 'uintp', 'P', 8, 'long', 'l', 'L', 'longlong', 'q', 9, 'ulonglong', 'Q', 10, 'half', 'e', 23, 'f', 11, 'double', 'd', 12, 'longdouble', 'g', 13, 'cfloat', 'F', 14, 'cdouble', 'D', 15, 'clongdouble', 'G', 16, 'O', 17, 'S', 18, 'unicode', 'U', 19, 'void', 'V', 20, 'M', 21, 'm', 22, 'bool8', 'Bool', 'b1', 'float16', 'Float16', 'f2', 'float32', 'Float32', 'f4', 'float64', 'Float64', 'f8', 'float128', 'Float128', 'f16', 'complex64', 'Complex32', 'c8', 'complex128', 'Complex64', 'c16', 'complex256', 'Complex128', 'c32', 'object0', 'Object0', 'bytes0', 'Bytes0', 'str0', 'Str0', 'void0', 'Void0', 'datetime64', 'Datetime64', 'M8', 'timedelta64', 'Timedelta64', 'm8', 'int64', 'uint64', 'Int64', 'UInt64', 'i8', 'u8', 'int32', 'uint32', 'Int32', 'UInt32', 'i4', 'u4', 'int16', 'uint16', 'Int16', 'UInt16', 'i2', 'u2', 'int8', 'uint8', 'Int8', 'UInt8', 'i1', 'u1', 'complex_', 'int0', 'uint0', 'single', 'csingle', 'singlecomplex', 'float_', 'intc', 'uintc', 'int_', 'longfloat', 'clongfloat', 'longcomplex', 'bool_', 'unicode_', 'object_', 'bytes_', 'str_', 'string_', 'int', 'float', 'complex', 'bool', 'object', 'str', 'bytes', 'a'])

For example:

In [147]: np.dtype('uint32')
Out[147]: dtype('uint32')

In [148]: np.dtype('float64')
Out[148]: dtype('float64')
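The link between these names and np.sctypeDict can be checked with a sketch like the following. The three names are picked arbitrarily from the keys listed above.

```python
import numpy as np

names = ['uint32', 'float64', 'int8']

# Each name is a key of np.sctypeDict and is accepted by np.dtype.
for name in names:
    print(name, name in np.sctypeDict, np.dtype(name))
```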

Tuples

A new dtype can be built from a tuple that contains a dtype.

Tuples come in several forms.

(flexible_dtype, itemsize)

For flexible dtypes, whose length is not fixed, you can specify the size:

In [149]: np.dtype((np.void, 10))
Out[149]: dtype('V10')

In [150]: np.dtype(('U', 10))
Out[150]: dtype('<U10')
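The sketch below checks what the second tuple element means for a few flexible types. The use of np.bytes_ is an addition to the examples above. For the void and bytes types the size is in bytes, while for U it is in characters.

```python
import numpy as np

void10 = np.dtype((np.void, 10))     # 10 bytes of raw data
uni10 = np.dtype(('U', 10))          # 10 Unicode characters
bytes5 = np.dtype((np.bytes_, 5))    # 5-byte string

print(void10.itemsize, uni10.itemsize, bytes5.itemsize)
```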

(fixed_dtype, shape)

For fixed-length dtypes, you can specify a shape:

In [151]:  np.dtype((np.int32, (2,2)))
Out[151]: dtype(('<i4', (2, 2)))

In [152]: np.dtype(('i4, (2,3)f8, f4', (2,3)))
Out[152]: dtype(([('f0', '<i4'), ('f1', '<f8', (2, 3)), ('f2', '<f4')], (2, 3)))
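A dtype built this way is a sub-array dtype. The following sketch looks at its base and shape attributes, and at what happens when it is used to create an array: the sub-array shape is appended to the shape of the array.

```python
import numpy as np

sub = np.dtype((np.int32, (2, 2)))
print(sub.base, sub.shape, sub.itemsize)   # element type, sub-array shape, total bytes

arr = np.zeros(3, dtype=sub)
print(arr.shape, arr.dtype)                # the sub-array dimensions become array dimensions
```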

[(field_name, field_dtype, field_shape), ...]

Each element of the list describes one field. A field consists of two or three parts: field name, field dtype and field shape.

If field_name is '' (an empty string), a default name of the form f0, f1, ... is used. field_name can also be a 2-tuple consisting of a title and a name.

field_dtype is the dtype of the field.

field_shape is optional; specify it when the field is a sub-array.

In [153]: np.dtype([('big', '>i4'), ('little', '<i4')])
Out[153]: dtype([('big', '>i4'), ('little', '<i4')])

There are two fields above. One is a big-endian 32-bit integer and the other is a little-endian 32-bit integer.

In [154]: np.dtype([('R','u1'), ('G','u1'), ('B','u1'), ('A','u1')])
Out[154]: dtype([('R', 'u1'), ('G', 'u1'), ('B', 'u1'), ('A', 'u1')])

Four fields, each an unsigned 8-bit integer.
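The rules for field names, titles and shapes can be tried out with a sketch like this one. The field names pos and r and the pixel values are made up for illustration.

```python
import numpy as np

rgba = np.dtype([('R', 'u1'), ('G', 'u1'), ('B', 'u1'), ('A', 'u1')])
pixels = np.zeros(2, dtype=rgba)
pixels['R'] = [255, 128]          # assign one field across all records

unnamed = np.dtype([('', 'i4'), ('', 'f4')])        # empty names get defaults
titled = np.dtype([(('Red pixel', 'r'), 'u1')])     # (title, name) tuple
vec = np.dtype([('pos', 'f4', (3,))])               # field with a shape

print(unnamed.names)
print(titled.names, sorted(titled.fields))
print(vec['pos'].shape)
```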

{'names': ..., 'formats': ..., 'offsets': ..., 'titles': ..., 'itemsize': ...}

In this form, you specify a list of names and a list of formats:

In [157]: np.dtype({'names': ['r','g','b','a'], 'formats': [np.uint8, np.uint8, np.uint8, np.uint8]})
Out[157]: dtype([('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')])

offsets gives the byte offset of each field, titles gives the title of each field, and itemsize is the total size of the dtype in bytes.

In [158]: np.dtype({'names': ['r','b'], 'formats': ['u1', 'u1'],
     ...:                'offsets': [0, 2],
     ...:                'titles': ['Red pixel', 'Blue pixel']})
     ...:
Out[158]: dtype({'names':['r','b'], 'formats':['u1','u1'], 'offsets':[0,2], 'titles':['Red pixel','Blue pixel'], 'itemsize':3})
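The offsets and titles can be read back from the fields attribute, as in the sketch below. The second dtype, with an explicit itemsize of 4, is an addition that shows how trailing padding can be requested.

```python
import numpy as np

spec = {'names': ['r', 'b'], 'formats': ['u1', 'u1'],
        'offsets': [0, 2], 'titles': ['Red pixel', 'Blue pixel']}
dt = np.dtype(spec)

print(dt.itemsize)        # last offset plus the size of the last field
print(dt.fields['b'])     # (dtype, offset, title) for field 'b'

# Same layout, but with one extra byte of padding at the end.
padded = np.dtype(dict(spec, itemsize=4))
print(padded.itemsize)
```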

(base_dtype, new_dtype)

A basic dtype can be reinterpreted as a structured dtype:

In [159]: np.dtype((np.int32,{'real':(np.int16, 0),'imag':(np.int16, 2)}))
Out[159]: dtype([('real', '<i2'), ('imag', '<i2')])

A 32-bit integer is interpreted as two 16-bit integers.

In [161]: np.dtype(('i4', [('r','u1'),('g','u1'),('b','u1'),('a','u1')]))
Out[161]: dtype([('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')])

A 32-bit integer is interpreted as four unsigned 8-bit integers.
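What this reinterpretation means for actual data can be seen by viewing an existing array through such a dtype. This is a sketch under the assumption that explicit little-endian codes are used, so that the byte layout does not depend on the machine. The value 0x00020001 is chosen so that the two halves are easy to recognize.

```python
import numpy as np

# (base_dtype, new_dtype): a 4-byte integer with two 2-byte fields laid over it.
dt = np.dtype(('<i4', {'real': ('<i2', 0), 'imag': ('<i2', 2)}))

x = np.array([0x00020001], dtype='<i4')   # bytes in memory: 01 00 02 00
y = x.view(dt)                            # same 4 bytes, now with named fields

print(y['real'], y['imag'])               # low and high 16 bits of the integer
```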

==============================================

Added by acabrera on Sun, 28 Nov 2021 01:36:49 +0200
