NumPy: data types

Brief introduction

Python has four built-in numeric types: int, float, bool and complex. As a scientific computing library, NumPy offers a much richer set of data types.

Today, I will explain the data types in NumPy in detail.

Data types in arrays

NumPy is implemented in C, so the data types of NumPy arrays can be mapped onto their C counterparts:

NumPy type	C type	Description
np.bool_	bool	Boolean (True or False) stored as a byte
np.byte	signed char	Platform-defined
np.ubyte	unsigned char	Platform-defined
np.short	short	Platform-defined
np.ushort	unsigned short	Platform-defined
np.intc	int	Platform-defined
np.uintc	unsigned int	Platform-defined
np.int_	long	Platform-defined
np.uint	unsigned long	Platform-defined
np.longlong	long long	Platform-defined
np.ulonglong	unsigned long long	Platform-defined
np.half / np.float16		Half precision float: sign bit, 5 bits exponent, 10 bits mantissa
np.single	float	Platform-defined single precision float: typically sign bit, 8 bits exponent, 23 bits mantissa
np.double	double	Platform-defined double precision float: typically sign bit, 11 bits exponent, 52 bits mantissa.
np.longdouble	long double	Platform-defined extended-precision float
np.csingle	float complex	Complex number, represented by two single-precision floats (real and imaginary components)
np.cdouble	double complex	Complex number, represented by two double-precision floats (real and imaginary components).
np.clongdouble	long double complex	Complex number, represented by two extended-precision floats (real and imaginary components).

Let's spot-check a few of the types above in an IPython session:

import numpy as np

In [26]: np.byte
Out[26]: numpy.int8

In [27]: np.bool_
Out[27]: numpy.bool_

In [28]: np.ubyte
Out[28]: numpy.uint8

In [29]: np.short
Out[29]: numpy.int16

In [30]: np.ushort
Out[30]: numpy.uint16
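
The same check can be scripted instead of typed by hand. The sketch below resolves a few of the platform-defined aliases to their fixed-width name and size on the current machine. The helper name `describe_alias` is made up for this illustration, and the result for types such as np.int_ will vary between platforms.

```python
import numpy as np

def describe_alias(scalar_type):
    """Return (fixed-width dtype name, size in bytes) for a NumPy scalar type."""
    dt = np.dtype(scalar_type)
    return dt.name, dt.itemsize

# A handful of the platform-defined aliases from the table above.
aliases = {
    "np.byte": np.byte,
    "np.short": np.short,
    "np.intc": np.intc,
    "np.int_": np.int_,
    "np.longlong": np.longlong,
}

for label, scalar_type in aliases.items():
    name, size = describe_alias(scalar_type)
    print(f"{label:12s} -> {name:6s} ({size} bytes)")
```

Running this on two different systems is a quick way to see which aliases are safe to rely on and which are not.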

So the types above are aliases for fixed-width data types underneath. The fixed-width types are:

NumPy type	C type	Description
np.int8	int8_t	Byte (-128 to 127)
np.int16	int16_t	Integer (-32768 to 32767)
np.int32	int32_t	Integer (-2147483648 to 2147483647)
np.int64	int64_t	Integer (-9223372036854775808 to 9223372036854775807)
np.uint8	uint8_t	Unsigned integer (0 to 255)
np.uint16	uint16_t	Unsigned integer (0 to 65535)
np.uint32	uint32_t	Unsigned integer (0 to 4294967295)
np.uint64	uint64_t	Unsigned integer (0 to 18446744073709551615)
np.intp	intptr_t	Integer used for indexing, typically the same as ssize_t
np.uintp	uintptr_t	Integer large enough to hold a pointer
np.float32	float	Single-precision float: typically sign bit, 8 bits exponent, 23 bits mantissa
np.float64 / np.float_	double	Note that this matches the precision of the builtin python float.
np.complex64	float complex	Complex number, represented by two 32-bit floats (real and imaginary components)
np.complex128 / np.complex_	double complex	Note that this matches the precision of the builtin python complex.
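
The claim that np.float64 matches the built-in Python float can be checked against `sys.float_info`. This is only a small sketch; the variable names are chosen for the example.

```python
import sys
import numpy as np

f64 = np.finfo(np.float64)

# Compare NumPy's view of float64 with Python's view of its own float.
same_max = bool(f64.max == sys.float_info.max)
same_eps = bool(f64.eps == sys.float_info.epsilon)
# nmant counts stored mantissa bits; mant_dig also counts the implicit leading bit.
same_digits = (f64.nmant + 1) == sys.float_info.mant_dig

# The real part of a complex128 scalar is a float64.
real_is_f64 = np.complex128(1 + 2j).real.dtype == np.float64

print(same_max, same_eps, same_digits, real_is_f64)
```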

Each of these types corresponds to a dtype object. They fall into five basic kinds: bool, int, uint, float and complex.

The number after the type name is the number of bits the type occupies. For example, int32 uses 32 bits (4 bytes), and complex64 uses 64 bits in total for its two 32-bit floats.

Some of the types in the tables above are platform-defined: their size depends on the platform and compiler, so take particular care when using them in code that must behave the same everywhere.
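
To see how the number in a type name relates to its storage size, the `itemsize` attribute of a dtype (a size in bytes) can be multiplied by 8. In this short sketch, `bits` is a helper name made up for the example.

```python
import numpy as np

def bits(scalar_type):
    """Storage size of a NumPy type in bits (itemsize is given in bytes)."""
    return np.dtype(scalar_type).itemsize * 8

for t in (np.int8, np.uint16, np.int32, np.float64, np.complex64, np.complex128):
    dt = np.dtype(t)
    print(f"{dt.name:10s} {dt.itemsize} bytes = {bits(t)} bits")
```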

These dtype types can be manually specified when creating an array:

>>> import numpy as np
>>> x = np.float32(1.0)
>>> x
1.0
>>> y = np.int_([1,2,4])
>>> y
array([1, 2, 4])
>>> z = np.arange(3, dtype=np.uint8)
>>> z
array([0, 1, 2], dtype=uint8)

For backward compatibility, the dtype can also be given as a character code when creating an array.

>>> np.array([1, 2, 3], dtype='f')
array([ 1.,  2.,  3.], dtype=float32)

The 'f' above is the character code for the single-precision float type (float32).
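
A few more codes, as a rough sketch: the dictionary below pairs some single-character codes and sized type strings (such as 'i4', meaning a 4-byte integer) with the type each should resolve to. The selection of codes is my own and is not taken from the article.

```python
import numpy as np

# Code string -> the NumPy type it is expected to resolve to.
codes = {
    "?": np.bool_,        # boolean
    "f": np.float32,      # single-precision float
    "d": np.float64,      # double-precision float
    "i4": np.int32,       # 4-byte signed integer
    "u1": np.uint8,       # 1-byte unsigned integer
    "c16": np.complex128  # 16-byte complex number
}

matches = {code: np.dtype(code) == np.dtype(expected) for code, expected in codes.items()}
for code, ok in matches.items():
    print(f"{code!r:6s} -> {np.dtype(code).name:10s} matches: {ok}")

# Going the other way: every dtype exposes its single-character code.
print(np.dtype(np.float32).char)
```

Spelling out np.float32 is usually easier to read than 'f', so the character codes are mostly useful when reading older code.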

Type conversion

To convert the type of an existing array, use the array's astype method, or call the target NumPy type directly on the array (for example, np.int8(z)):

In [33]: z = np.arange(3, dtype=np.uint8)

In [34]: z
Out[34]: array([0, 1, 2], dtype=uint8)

In [35]: z.astype(float)
Out[35]: array([0., 1., 2.])

In [36]: np.int8(z)
Out[36]: array([0, 1, 2], dtype=int8)
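
Two details of conversion are easy to miss: astype returns a new array and leaves the original unchanged, and converting floats to integers drops the fractional part rather than rounding. A minimal sketch with made-up values:

```python
import numpy as np

values = np.array([1.7, -1.7, 2.0])

# astype returns a new array; `values` keeps its float64 dtype.
as_int = values.astype(np.int32)
print(as_int)        # fractional parts are truncated toward zero
print(values.dtype)

# Calling the type directly on an array gives the same result as astype.
z = np.arange(3, dtype=np.uint8)
same = np.array_equal(np.int8(z), z.astype(np.int8))
print(same)
```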

Note that we passed Python's built-in float above; NumPy automatically maps float to np.float_ (that is, np.float64). The same shorthand works for int (np.int_), bool (np.bool_) and complex (np.complex_, that is, np.complex128). Other data types have no such shorthand.
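
The mapping from Python built-in types to NumPy types can be confirmed with np.dtype. In this sketch the int case is only printed, not compared, because the width it resolves to depends on the platform.

```python
import numpy as np

float_ok = np.dtype(float) == np.float64
complex_ok = np.dtype(complex) == np.complex128
bool_ok = np.dtype(bool) == np.bool_
print(float_ok, complex_ok, bool_ok)

# int maps to a platform-dependent width, so just show what it is here.
print(np.dtype(int).name)

# The shorthand works anywhere a dtype is accepted.
zeros = np.zeros(2, dtype=float)
print(zeros.dtype)
```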

Viewing the type

To view the data type of an array, you can use the built-in dtype attribute:

In [37]: z.dtype
Out[37]: dtype('uint8')

A dtype object can itself be inspected and tested:

>>> d = np.dtype(int)
>>> d
dtype('int32')

>>> np.issubdtype(d, np.integer)
True

>>> np.issubdtype(d, np.floating)
False
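
np.issubdtype is handy when a function must accept arrays of several types but treat them differently. The following is a sketch only: `classify` is a hypothetical helper, and the labels it returns are arbitrary.

```python
import numpy as np

def classify(arr):
    """Return a coarse label for the dtype of `arr`."""
    dt = arr.dtype
    if np.issubdtype(dt, np.bool_):
        return "bool"
    if np.issubdtype(dt, np.unsignedinteger):
        return "uint"
    if np.issubdtype(dt, np.signedinteger):
        return "int"
    if np.issubdtype(dt, np.floating):
        return "float"
    if np.issubdtype(dt, np.complexfloating):
        return "complex"
    return "other"

print(classify(np.arange(3, dtype=np.uint8)))
print(classify(np.array([1.5, 2.5])))
print(classify(np.array([True, False])))

# A dtype also exposes a one-letter kind code and its size in bytes.
d = np.dtype(np.uint8)
print(d.kind, d.itemsize, d.name)
```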

Data overflow

Generally speaking, a value that exceeds the range of its data type raises an exception. For example, take a very long int value:

In [38]: a= 1000000000000000000000000000000000000000000000000000000000000000000000000000000

In [39]: a
Out[39]: 1000000000000000000000000000000000000000000000000000000000000000000000000000000

In [40]: int(1000000000000000000000000000000000000000000000000000000)
Out[40]: 1000000000000000000000000000000000000000000000000000000

In [41]: np.int32(1000000000000000000000000000000000000000000000000000000)
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-41-71feb4433730> in <module>()
----> 1 np.int32(1000000000000000000000000000000000000000000000000000000)

The number above exceeds the range of int32, so an exception is raised.

However, some NumPy operations do not raise an exception when the result overflows; instead the result silently wraps around within the range of the type. This needs particular attention:

In [43]: np.power(100, 8, dtype=np.int32)
Out[43]: 1874919424

In [44]: np.power(100, 8, dtype=np.int64)
Out[44]: 10000000000000000
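
The int32 result is not random: it is the exact value reduced modulo 2**32 and read back as a signed number. The sketch below reproduces it with plain Python integers, which never overflow. `wrap_to_int32` is a helper written for this example, not a NumPy function.

```python
import numpy as np

def wrap_to_int32(value):
    """Map an exact Python int to the value a signed 32-bit integer would hold."""
    return (value + 2**31) % 2**32 - 2**31

exact = 100 ** 8                      # Python ints are arbitrary precision
wrapped = wrap_to_int32(exact)

# The same power computed in int32 storage.
base = np.array([100], dtype=np.int32)
numpy_result = int(np.power(base, 8)[0])

print(exact)
print(wrapped)
print(numpy_result)
```

Comparing the NumPy result against an exact Python computation like this is a simple way to detect silent overflow in a test.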

NumPy provides two functions for querying the range of integer and floating-point types, np.iinfo and np.finfo:

In [45]:  np.iinfo(int)
Out[45]: iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64)

In [46]: np.iinfo(np.int32)
Out[46]: iinfo(min=-2147483648, max=2147483647, dtype=int32)

In [47]: np.iinfo(np.int64)
Out[47]: iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64)
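
np.finfo works the same way for floating-point types, and np.iinfo makes it easy to test a value before storing it. In this sketch, `fits` is a hypothetical helper.

```python
import numpy as np

def fits(value, int_type):
    """True if the Python int `value` is within the range of `int_type`."""
    info = np.iinfo(int_type)
    return info.min <= value <= info.max

print(fits(255, np.uint8), fits(256, np.uint8))
print(fits(10**16, np.int32), fits(10**16, np.int64))

# finfo reports the limits of floating-point types.
f32 = np.finfo(np.float32)
f64 = np.finfo(np.float64)
print(f32.max, f32.eps)
print(f64.max, f64.eps)
```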

If a 64-bit int is still too small, you can use np.float64. Because float64 stores numbers in a floating-point form similar to scientific notation, it covers a much wider range, but at the cost of reduced precision.

In [48]: np.power(100, 100, dtype=np.int64)
Out[48]: 0

In [49]: np.power(100, 100, dtype=np.float64)
Out[49]: 1e+200
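
The loss of precision shows up as soon as integers exceed 2**53, because float64 has 53 significant bits. A minimal sketch:

```python
import numpy as np

big = 2 ** 53

# In float64, adding 1 to 2**53 is lost to rounding.
as_float = np.float64(big)
float_loses_one = bool(as_float + 1 == as_float)

# In int64 the same addition is exact.
as_int = np.int64(big)
int_keeps_one = bool(as_int + 1 != as_int)

print(float_loses_one, int_keeps_one)
```

So float64 is a reasonable choice when magnitude matters more than exact integer values, and a poor one for counters or identifiers.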

This article has been included in http://www.flydean.com/02-python-numpy-datatype/


Keywords: Python Programming Data Analysis numpy

Added by medar on Sat, 19 Feb 2022 16:23:16 +0200
