NumPy: data types

Brief introduction

Python has four built-in numeric types: int, float, bool and complex. NumPy, as a scientific computing library, offers a much richer set of data types.

Today, I will explain the data types in NumPy in detail.

Data types in arrays

NumPy is implemented in C, so the data types of its arrays can be mapped to the corresponding C types:

| NumPy type | C type | Description |
|---|---|---|
| np.bool_ | bool | Boolean (True or False) stored as a byte |
| np.byte | signed char | Platform-defined |
| np.ubyte | unsigned char | Platform-defined |
| np.short | short | Platform-defined |
| np.ushort | unsigned short | Platform-defined |
| np.intc | int | Platform-defined |
| np.uintc | unsigned int | Platform-defined |
| np.int_ | long | Platform-defined |
| np.uint | unsigned long | Platform-defined |
| np.longlong | long long | Platform-defined |
| np.ulonglong | unsigned long long | Platform-defined |
| np.half / np.float16 | | Half precision float: sign bit, 5 bits exponent, 10 bits mantissa |
| np.single | float | Platform-defined single precision float: typically sign bit, 8 bits exponent, 23 bits mantissa |
| np.double | double | Platform-defined double precision float: typically sign bit, 11 bits exponent, 52 bits mantissa |
| np.longdouble | long double | Platform-defined extended-precision float |
| np.csingle | float complex | Complex number, represented by two single-precision floats (real and imaginary components) |
| np.cdouble | double complex | Complex number, represented by two double-precision floats (real and imaginary components) |
| np.clongdouble | long double complex | Complex number, represented by two extended-precision floats (real and imaginary components) |

Let's check in an IPython session what some of the above types resolve to:

import numpy as np

In [26]: np.byte
Out[26]: numpy.int8

In [27]: np.bool_
Out[27]: numpy.bool_

In [28]: np.ubyte
Out[28]: numpy.uint8

In [29]: np.short
Out[29]: numpy.int16

In [30]: np.ushort
Out[30]: numpy.uint16

Therefore, on a given platform each of the above types resolves to a fixed-width data type. The fixed-width types are listed below:

| NumPy type | C type | Description |
|---|---|---|
| np.int8 | int8_t | Byte (-128 to 127) |
| np.int16 | int16_t | Integer (-32768 to 32767) |
| np.int32 | int32_t | Integer (-2147483648 to 2147483647) |
| np.int64 | int64_t | Integer (-9223372036854775808 to 9223372036854775807) |
| np.uint8 | uint8_t | Unsigned integer (0 to 255) |
| np.uint16 | uint16_t | Unsigned integer (0 to 65535) |
| np.uint32 | uint32_t | Unsigned integer (0 to 4294967295) |
| np.uint64 | uint64_t | Unsigned integer (0 to 18446744073709551615) |
| np.intp | intptr_t | Integer used for indexing, typically the same as ssize_t |
| np.uintp | uintptr_t | Integer large enough to hold a pointer |
| np.float32 | float | |
| np.float64 / np.float_ | double | Note that this matches the precision of the builtin python float. |
| np.complex64 | float complex | Complex number, represented by two 32-bit floats (real and imaginary components) |
| np.complex128 / np.complex_ | double complex | Note that this matches the precision of the builtin python complex. |
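
The integer ranges in the table follow directly from the bit width of each type. As a minimal sketch, the ranges can be recomputed and compared with what np.iinfo reports (the helper name expected_range is made up for this illustration):

```python
import numpy as np

def expected_range(bits, signed):
    """Return the (min, max) that an integer of the given bit width can hold."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

# Compare the computed range with what NumPy reports for each fixed-width type.
for dtype in (np.int8, np.int16, np.int32, np.int64,
              np.uint8, np.uint16, np.uint32, np.uint64):
    info = np.iinfo(dtype)
    signed = np.issubdtype(dtype, np.signedinteger)
    assert (info.min, info.max) == expected_range(info.bits, signed)
    print(np.dtype(dtype).name, info.min, info.max)
```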

Each of these types corresponds to a dtype (data type) object. The five commonly used basic kinds are bool, int, uint, float and complex.

The number after the type name indicates the number of bits the type occupies; for example, np.int32 occupies 32 bits (4 bytes).

Some of the types in the first table are platform-defined: their size depends on the platform, so take particular care when using them in code that must behave the same everywhere.
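
To see how wide the platform-defined types are on the machine at hand, their dtype can be inspected. This is only a sketch; the printed values differ between platforms, so no expected output is shown:

```python
import numpy as np

# Platform-defined types: their width depends on the C compiler and the OS.
platform_types = (np.intc, np.int_, np.longlong, np.intp, np.longdouble)

# itemsize is measured in bytes; multiply by 8 to get the width in bits.
widths = [(np.dtype(t).name, np.dtype(t).itemsize * 8) for t in platform_types]

for name, bits in widths:
    print(f"{name}: {bits} bits")
```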

These types can be specified explicitly when creating a scalar or an array:

>>> import numpy as np
>>> x = np.float32(1.0)
>>> x
1.0
>>> y = np.int_([1,2,4])
>>> y
array([1, 2, 4])
>>> z = np.arange(3, dtype=np.uint8)
>>> z
array([0, 1, 2], dtype=uint8)
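
One practical reason to choose the dtype explicitly is memory use. A small sketch comparing the storage of the same number of elements under different dtypes:

```python
import numpy as np

# The same 1000 values stored with different dtypes.
a64 = np.zeros(1000, dtype=np.float64)
a32 = np.zeros(1000, dtype=np.float32)
a8 = np.zeros(1000, dtype=np.uint8)

# nbytes = number of elements * itemsize (bytes per element).
print(a64.nbytes)  # 8000
print(a32.nbytes)  # 4000
print(a8.nbytes)   # 1000
```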

For backward compatibility, a dtype can also be specified by its character code when creating an array:

>>> np.array([1, 2, 3], dtype='f')
array([ 1.,  2.,  3.], dtype=float32)

The character code 'f' above stands for a single-precision float (float32).
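
A few other character codes work the same way. The sketch below checks a handful of them against the type names from the first table; the selection of codes is just an illustration, not a complete list:

```python
import numpy as np

# Each pair is (character code, NumPy type it is expected to stand for).
codes = [("?", np.bool_), ("b", np.byte), ("B", np.ubyte),
         ("f", np.single), ("d", np.double)]

for code, np_type in codes:
    # np.dtype() accepts the one-character code and returns a dtype object.
    assert np.dtype(code) == np.dtype(np_type)
    print(repr(code), "->", np.dtype(code).name)

# The reverse lookup is available through the .char attribute.
print(np.dtype(np.float32).char)  # f
```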

Type conversion

If you want to convert an existing array to another type, you can use the array's astype method, or call the target NumPy type as a function:

In [33]: z = np.arange(3, dtype=np.uint8)

In [34]: z
Out[34]: array([0, 1, 2], dtype=uint8)

In [35]: z.astype(float)
Out[35]: array([0., 1., 2.])

In [36]: np.int8(z)
Out[36]: array([0, 1, 2], dtype=int8)

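Note that we passed Python's built-in float above; NumPy automatically interprets it as np.float_ (that is, np.float64). The same shorthand works for three other built-in types: int maps to np.int_, bool to np.bool_, and complex to np.complex_ (np.complex128). Other data types have no such shorthand. The aliases np.float_ and np.complex_ were removed in NumPy 2.0; use np.float64 and np.complex128 there.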
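
The mapping from Python's built-in types can be checked directly. The sketch below also shows two details of astype worth knowing: it returns a new array, and converting float to int drops the fractional part:

```python
import numpy as np

# Python built-in types are accepted wherever a dtype is expected.
print(np.dtype(float) == np.float64)        # True
print(np.dtype(complex) == np.complex128)   # True
print(np.dtype(bool) == np.bool_)           # True
print(np.dtype(int) == np.dtype(np.int_))   # True

# astype returns a new array; the original keeps its dtype.
x = np.array([1.7, -1.7, 2.0])
y = x.astype(int)   # float -> int truncates toward zero
print(y)            # [ 1 -1  2]
print(x.dtype)      # float64
```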

View type

To view the data type of an array, you can use the built-in dtype attribute:

In [37]: z.dtype
Out[37]: dtype('uint8')

Some operations can also be performed on the dtype object itself (note that the result of np.dtype(int) depends on the platform):

>>> d = np.dtype(int)
>>> d
dtype('int32')

>>> np.issubdtype(d, np.integer)
True

>>> np.issubdtype(d, np.floating)
False
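
Building on np.issubdtype, a dtype can be sorted into one of the five basic kinds mentioned earlier. This is a minimal sketch; describe_kind is a hypothetical helper, not part of NumPy:

```python
import numpy as np

def describe_kind(dtype):
    """Return a coarse category name for a dtype (hypothetical helper)."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.bool_):
        return "bool"
    if np.issubdtype(dtype, np.unsignedinteger):
        return "uint"
    if np.issubdtype(dtype, np.signedinteger):
        return "int"
    if np.issubdtype(dtype, np.floating):
        return "float"
    if np.issubdtype(dtype, np.complexfloating):
        return "complex"
    return "other"

for t in (np.bool_, np.uint8, np.int64, np.float32, np.complex128):
    print(np.dtype(t).name, "->", describe_kind(t))
```

Note that the boolean type is checked first; it is not a subtype of np.integer, so it would otherwise fall through to "other".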

Data overflow

When a Python int is converted to a NumPy type whose range it exceeds, an exception is raised. For example, take a very large int value:

In [38]: a= 1000000000000000000000000000000000000000000000000000000000000000000000000000000

In [39]: a
Out[39]: 1000000000000000000000000000000000000000000000000000000000000000000000000000000

In [40]: int(1000000000000000000000000000000000000000000000000000000)
Out[40]: 1000000000000000000000000000000000000000000000000000000

In [41]: np.int32(1000000000000000000000000000000000000000000000000000000)
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-41-71feb4433730> in <module>()
----> 1 np.int32(1000000000000000000000000000000000000000000000000000000)

Because this number is far outside the range of int32, an OverflowError is raised.
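
If the value comes from outside the program, the exception can be caught instead of letting it propagate. A minimal sketch, where to_int32_or_none is a hypothetical helper; it is only exercised with a value far outside the range, because values just past the limit may be handled differently depending on the NumPy version:

```python
import numpy as np

def to_int32_or_none(value):
    """Convert a Python int to np.int32, or return None on overflow (hypothetical helper)."""
    try:
        return np.int32(value)
    except OverflowError:
        return None

print(to_int32_or_none(5))
print(to_int32_or_none(10 ** 54))   # None
```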

However, some NumPy operations do not raise an exception when a result exceeds the range of the dtype. The value silently wraps around and an incorrect result is returned, so this needs special attention:

In [43]: np.power(100, 8, dtype=np.int32)
Out[43]: 1874919424

In [44]: np.power(100, 8, dtype=np.int64)
Out[44]: 10000000000000000
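
The same silent wrap-around happens in ordinary array arithmetic. A small sketch with int8, where the effect is easy to follow by hand, and one way to avoid it by widening an operand first:

```python
import numpy as np

a = np.array([100, 120, 127], dtype=np.int8)   # int8 holds -128..127
b = np.array([100, 10, 1], dtype=np.int8)

# The sums 200, 130 and 128 do not fit in int8, so they wrap around silently.
print(a + b)   # [ -56 -126 -128]

# Widening one operand before the operation avoids the wrap-around.
print(a.astype(np.int16) + b)   # [200 130 128]
```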

NumPy provides two functions for querying the limits of integer and floating-point types, np.iinfo and np.finfo:

In [45]:  np.iinfo(int)
Out[45]: iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64)

In [46]: np.iinfo(np.int32)
Out[46]: iinfo(min=-2147483648, max=2147483647, dtype=int32)

In [47]: np.iinfo(np.int64)
Out[47]: iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64)
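
np.finfo works the same way for floating-point types, and np.iinfo can be used to check a value before converting it. A minimal sketch; the helper fits is hypothetical and only handles integer dtypes:

```python
import numpy as np

# np.finfo reports the limits of floating-point types.
f32 = np.finfo(np.float32)
f64 = np.finfo(np.float64)
print(f32.bits, f32.max)
print(f64.bits, f64.eps)

def fits(value, dtype):
    """Return True if a Python int fits in the given integer dtype (hypothetical helper)."""
    info = np.iinfo(dtype)
    return info.min <= value <= info.max

print(fits(100 ** 8, np.int32))   # False
print(fits(100 ** 8, np.int64))   # True
```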

If a 64-bit int is still too small, you can use np.float64. Because float64 stores values in a floating-point (scientific notation) format, it covers a much wider range, at the cost of reduced precision:

In [48]: np.power(100, 100, dtype=np.int64)
Out[48]: 0

In [49]: np.power(100, 100, dtype=np.float64)
Out[49]: 1e+200
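
The loss of precision can be made concrete. float64 has a 53-bit significand, so above 2**53 it can no longer tell neighbouring integers apart, as this short sketch shows:

```python
import numpy as np

big = 2 ** 53   # float64 has a 53-bit significand

# As int64, neighbouring integers stay distinct.
print(np.int64(big) + np.int64(1) == np.int64(big))        # False

# As float64, big + 1 cannot be represented and rounds back to big.
print(np.float64(big) + np.float64(1) == np.float64(big))  # True
```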

This article has been included in http://www.flydean.com/02-python-numpy-datatype/

Keywords: Python Programming Data Analysis numpy

Added by medar on Sat, 19 Feb 2022 16:23:16 +0200