datetime64 Basics
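NumPy can parse a string directly into its date-time type, datetime64 (the name datetime was already taken by the datetime module in Python's standard library).
datetime64 is a date-time type that carries a unit. The date units are Y (year), M (month), W (week) and D (day); the time units are h (hour), m (minute), s (second), ms (millisecond), us (microsecond), ns (nanosecond), ps (picosecond), fs (femtosecond) and as (attosecond).
[example] When a datetime64 is created from a string, NumPy by default selects the unit that matches the precision of the string.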
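As a small sketch of how the unit is attached to the type, the snippet below reads the unit back from the dtype with np.datetime_data. The variable names are illustrative only.

```python
import numpy as np

# The unit is part of the dtype and can be given as the second argument.
day = np.datetime64('2020-03-01', 'D')
milli = np.datetime64('2020-03-01', 'ms')

# np.datetime_data returns a (unit, count) tuple for a datetime dtype.
day_unit = np.datetime_data(day.dtype)
milli_unit = np.datetime_data(milli.dtype)

print(day, day.dtype)      # 2020-03-01 datetime64[D]
print(milli, milli.dtype)  # 2020-03-01T00:00:00.000 datetime64[ms]
```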
import numpy as np

a = np.datetime64('2020-03-01')
print(a, a.dtype)  # 2020-03-01 datetime64[D]

a = np.datetime64('2020-03')
print(a, a.dtype)  # 2020-03 datetime64[M]

a = np.datetime64('2020-03-08 20:00:05')
print(a, a.dtype)  # 2020-03-08T20:00:05 datetime64[s]

a = np.datetime64('2020-03-08 20:00')
print(a, a.dtype)  # 2020-03-08T20:00 datetime64[m]

a = np.datetime64('2020-03-08 20')
print(a, a.dtype)  # 2020-03-08T20 datetime64[h]
[example] When a datetime64 is created from a string, the unit can also be specified explicitly.
import numpy as np

a = np.datetime64('2020-03', 'D')
print(a, a.dtype)  # 2020-03-01 datetime64[D]

a = np.datetime64('2020-03', 'Y')
print(a, a.dtype)  # 2020 datetime64[Y]

print(np.datetime64('2020-03') == np.datetime64('2020-03-01'))  # True
print(np.datetime64('2020-03') == np.datetime64('2020-03-02'))  # False
As the example above shows, 2020-03 and 2020-03-01 compare equal: a datetime64 with a month unit denotes the start of that month. More generally, two datetime64 values with different units can still represent the same moment, and converting from a coarser unit (such as months) to a finer unit (such as days) is safe.
[example] When a datetime64 array is created from strings with mixed units, every element is converted to the finest unit present.
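The safe direction of conversion can be sketched with astype. This is a minimal illustration with made-up variable names: month to day loses nothing, while day to month discards the day-of-month.

```python
import numpy as np

month = np.datetime64('2020-03')           # datetime64[M]

# Month -> day is lossless: the result is the first day of that month.
as_day = month.astype('datetime64[D]')
print(as_day)                              # 2020-03-01

# Day -> month truncates: the day-of-month is discarded.
mid_month = np.datetime64('2020-03-15')
back_to_month = mid_month.astype('datetime64[M]')
print(back_to_month)                       # 2020-03
print(back_to_month == month)              # True
```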
import numpy as np

a = np.array(['2020-03', '2020-03-08', '2020-03-08 20:00'], dtype='datetime64')
print(a, a.dtype)
# ['2020-03-01T00:00' '2020-03-08T00:00' '2020-03-08T20:00'] datetime64[m]
[example] Use arange() to create a datetime64 array covering a date range.
import numpy as np

a = np.arange('2020-08-01', '2020-08-10', dtype=np.datetime64)
print(a)
# ['2020-08-01' '2020-08-02' '2020-08-03' '2020-08-04' '2020-08-05'
#  '2020-08-06' '2020-08-07' '2020-08-08' '2020-08-09']
print(a.dtype)  # datetime64[D]

a = np.arange('2020-08-01 20:00', '2020-08-10', dtype=np.datetime64)
print(a)
# ['2020-08-01T20:00' '2020-08-01T20:01' '2020-08-01T20:02' ...
#  '2020-08-09T23:57' '2020-08-09T23:58' '2020-08-09T23:59']
print(a.dtype)  # datetime64[m]

a = np.arange('2020-05', '2020-12', dtype=np.datetime64)
print(a)
# ['2020-05' '2020-06' '2020-07' '2020-08' '2020-09' '2020-10' '2020-11']
print(a.dtype)  # datetime64[M]
datetime64 and timedelta64 operations
[example] timedelta64 represents the difference between two datetime64 values. A timedelta64 also carries a unit, which matches the finer of the two datetime64 units in the subtraction.
import numpy as np

a = np.datetime64('2020-03-08') - np.datetime64('2020-03-07')
b = np.datetime64('2020-03-08') - np.datetime64('2020-03-07 08:00')
c = np.datetime64('2020-03-08') - np.datetime64('2020-03-07 23:00', 'D')

print(a, a.dtype)  # 1 days timedelta64[D]
print(b, b.dtype)  # 960 minutes timedelta64[m]
print(c, c.dtype)  # 1 days timedelta64[D]

a = np.datetime64('2020-03') + np.timedelta64(20, 'D')
b = np.datetime64('2020-06-15 00:00') + np.timedelta64(12, 'h')
print(a, a.dtype)  # 2020-03-21 datetime64[D]
print(b, b.dtype)  # 2020-06-15T12:00 datetime64[m]
[example] When creating a timedelta64, note that the year ('Y') and month ('M') units cannot be converted to, or combined in arithmetic with, the other units, because the number of days in a year or hours in a month is not fixed.
import numpy as np

a = np.timedelta64(1, 'Y')
b = np.timedelta64(a, 'M')
print(a)  # 1 years
print(b)  # 12 months

c = np.timedelta64(1, 'h')
d = np.timedelta64(c, 'm')
print(c)  # 1 hours
print(d)  # 60 minutes

print(np.timedelta64(a, 'D'))
# TypeError: Cannot cast NumPy timedelta64 scalar from metadata [Y]
# to [D] according to the rule 'same_kind'

print(np.timedelta64(b, 'D'))
# TypeError: Cannot cast NumPy timedelta64 scalar from metadata [M]
# to [D] according to the rule 'same_kind'
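One way around this restriction, sketched here under the assumption that the month of interest is known, is to anchor the month to a concrete date and subtract day-resolution datetime64 values instead of casting the timedelta64 itself. The variable names are illustrative.

```python
import numpy as np

start = np.datetime64('2020-02')         # datetime64[M]
end = start + np.timedelta64(1, 'M')     # 2020-03, still datetime64[M]

# Convert both endpoints to days, then subtract to get a day count.
days_in_month = end.astype('datetime64[D]') - start.astype('datetime64[D]')
print(days_in_month)  # 29 days (2020 is a leap year)
```

The length depends on the anchor date, which is why NumPy refuses to convert a bare month or year timedelta64 to days.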
[example] Arithmetic on timedelta64 values.
import numpy as np

a = np.timedelta64(1, 'Y')
b = np.timedelta64(6, 'M')
c = np.timedelta64(1, 'W')
d = np.timedelta64(1, 'D')
e = np.timedelta64(10, 'D')

print(a)      # 1 years
print(b)      # 6 months
print(a + b)  # 18 months
print(a - b)  # 6 months
print(2 * a)  # 2 years
print(a / b)  # 2.0
print(c / d)  # 7.0
print(c % e)  # 7 days
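Dividing by a one-unit timedelta64 is a convenient way to express a duration in a chosen unit. The following is a small sketch with made-up timestamps, not taken from the examples above.

```python
import numpy as np

elapsed = np.datetime64('2020-03-08 12:00') - np.datetime64('2020-03-07 08:00')
print(elapsed)  # 1680 minutes

# True division yields a float in the chosen unit.
hours = elapsed / np.timedelta64(1, 'h')
print(hours)    # 28.0

# Floor division and remainder split the duration into whole days plus the rest.
whole_days = elapsed // np.timedelta64(1, 'D')
rest = elapsed % np.timedelta64(1, 'D')
print(whole_days, rest)  # 1 240 minutes
```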
[example] Converting between numpy.datetime64 and datetime.datetime.
import numpy as np
import datetime

dt = datetime.datetime(year=2020, month=6, day=1, hour=20, minute=5, second=30)
dt64 = np.datetime64(dt, 's')
print(dt64, dt64.dtype)
# 2020-06-01T20:05:30 datetime64[s]

dt2 = dt64.astype(datetime.datetime)
print(dt2, type(dt2))
# 2020-06-01 20:05:30 <class 'datetime.datetime'>
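The Python type that comes back from astype(datetime.datetime) depends on the unit of the datetime64. The sketch below illustrates the three cases: a day unit gives a datetime.date, units from seconds to microseconds give a datetime.datetime, and nanoseconds cannot be represented by datetime.datetime, so an integer count since the epoch is returned instead.

```python
import datetime
import numpy as np

as_date = np.datetime64('2020-06-01', 'D').astype(datetime.datetime)
as_dt = np.datetime64('2020-06-01T20:05:30', 'us').astype(datetime.datetime)
as_int = np.datetime64('2020-06-01T20:05:30', 'ns').astype(datetime.datetime)

print(type(as_date))  # <class 'datetime.date'>
print(type(as_dt))    # <class 'datetime.datetime'>
print(type(as_int))   # <class 'int'>
```

When a datetime.datetime is required, converting to a coarser unit first (for example with astype('datetime64[us]')) avoids the integer result.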
Application of datetime64
To allow the use of dates and times in a context where only certain days of the week are valid, NumPy includes a set of "busday" functions.
- numpy.busday_offset(dates, offsets, roll='raise', weekmask='1111100', holidays=None, busdaycal=None, out=None)
First adjusts the date to fall on a valid day according to the roll rule, then applies the offset to the adjusted date, counted in valid days.
Parameter roll: {'raise', 'nat', 'forward', 'following', 'backward', 'preceding', 'modifiedfollowing', 'modifiedpreceding'}
- 'raise' means to raise an exception for an invalid day.
- 'nat' means to return a NaT (not-a-time) for an invalid day.
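- 'forward' and 'following' mean to take the first valid day later in time.
- 'backward' and 'preceding' mean to take the first valid day earlier in time.
- 'modifiedfollowing' means to take the first valid day later in time unless it is across a month boundary, in which case to take the first valid day earlier in time.
- 'modifiedpreceding' means to take the first valid day earlier in time unless it is across a month boundary, in which case to take the first valid day later in time.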
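The less common roll rules can be sketched on a date where they differ. The date below is chosen for illustration: October 31, 2020 is a Saturday at the end of a month, so 'following' crosses into November, while 'modifiedfollowing' refuses to cross the month boundary and steps back instead; 'nat' returns NaT rather than raising.

```python
import numpy as np

# Saturday, October 31, 2020: the last day of the month falls on a weekend.
following = np.busday_offset('2020-10-31', 0, roll='following')
modified = np.busday_offset('2020-10-31', 0, roll='modifiedfollowing')
not_a_time = np.busday_offset('2020-10-31', 0, roll='nat')

print(following)   # 2020-11-02
print(modified)    # 2020-10-30
print(not_a_time)  # NaT
```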
[example] Apply an offset, counted in business days, to a date with day ('D') unit; here it finds the next business day. If the given date is not a business day, an error is raised by default. Specifying roll='forward' or roll='backward' avoids the error: the first takes the first valid business day later in time, the second takes the first valid business day earlier in time.
import numpy as np

# Friday, July 10, 2020
a = np.busday_offset('2020-07-10', offsets=1)
print(a)  # 2020-07-13

a = np.busday_offset('2020-07-11', offsets=1)
print(a)
# ValueError: Non-business day date in busday_offset

a = np.busday_offset('2020-07-11', offsets=0, roll='forward')
b = np.busday_offset('2020-07-11', offsets=0, roll='backward')
print(a)  # 2020-07-13
print(b)  # 2020-07-10

a = np.busday_offset('2020-07-11', offsets=1, roll='forward')
b = np.busday_offset('2020-07-11', offsets=1, roll='backward')
print(a)  # 2020-07-14
print(b)  # 2020-07-13
An offset of 0 combined with a roll rule returns the nearest business day after or before the given date. If the given date is already a business day, it is returned unchanged.
- numpy.is_busday(dates, weekmask='1111100', holidays=None, busdaycal=None, out=None)
Calculates which of the given dates are valid days, and which are not.
[example] Returns whether the specified date is a working day.
import numpy as np

# Friday, July 10, 2020
a = np.is_busday('2020-07-10')
b = np.is_busday('2020-07-11')
print(a)  # True
print(b)  # False
[example] count the number of working days in a datetime64[D] array.
import numpy as np

# Friday, July 10, 2020
begindates = np.datetime64('2020-07-10')
enddates = np.datetime64('2020-07-20')
a = np.arange(begindates, enddates, dtype='datetime64')
b = np.count_nonzero(np.is_busday(a))
print(a)
# ['2020-07-10' '2020-07-11' '2020-07-12' '2020-07-13' '2020-07-14'
#  '2020-07-15' '2020-07-16' '2020-07-17' '2020-07-18' '2020-07-19']
print(b)  # 6
[example] Customize the weekmask, that is, specify which days of the week (starting from Monday) are business days.
import numpy as np

# Friday, July 10, 2020
a = np.is_busday('2020-07-10', weekmask=[1, 1, 1, 1, 1, 0, 0])
b = np.is_busday('2020-07-10', weekmask=[1, 1, 1, 1, 0, 0, 1])
print(a)  # True
print(b)  # False
- numpy.busday_count(begindates, enddates, weekmask='1111100', holidays=[], busdaycal=None, out=None)
Counts the number of valid days between begindates and enddates, not including the day of enddates.
[example] returns the number of working days between two dates.
import numpy as np

# Friday, July 10, 2020
begindates = np.datetime64('2020-07-10')
enddates = np.datetime64('2020-07-20')
a = np.busday_count(begindates, enddates)
b = np.busday_count(enddates, begindates)
print(a)  # 6
print(b)  # -6
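The holidays and busdaycal parameters in the signatures above are not exercised by the examples, so here is a minimal sketch of how they might be used. The holiday date is hypothetical and chosen only for illustration; np.busdaycalendar bundles a weekmask and a holiday list so that they can be reused across calls.

```python
import numpy as np

# Hypothetical holiday list: treat Monday, July 13, 2020 as a holiday.
holidays = ['2020-07-13']

# Same range as above, but the holiday is no longer counted.
count = np.busday_count('2020-07-10', '2020-07-20', holidays=holidays)
print(count)     # 5

# A reusable calendar carrying both the weekmask and the holidays.
cal = np.busdaycalendar(weekmask='1111100', holidays=holidays)
is_open = np.is_busday('2020-07-13', busdaycal=cal)
next_day = np.busday_offset('2020-07-10', 1, busdaycal=cal)
print(is_open)   # False
print(next_day)  # 2020-07-14
```

When busdaycal is supplied, weekmask and holidays should not be passed to the same call, since the calendar already carries both.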