A detailed guide to time processing in pandas

Contents

1. Six time related classes in pandas
2. Timestamp class
   1) Check whether the time column is a string column or a datetime column
   2) Use pd.to_datetime() to convert a string to datetime format
   3) The Timestamp class can only represent times from 1677 to 2262
   4) Common properties of the Timestamp class
   5) Extract the date in a specified format by using the strftime() method
3. DatetimeIndex and PeriodIndex: similar to the to_datetime() function
4. Timedelta class
   1) Move the date forward and backward by one day
   2) Difference between two times

1. Six time related classes in pandas

In most cases, analyzing time data first requires converting times that are stored as strings into a standard time type. pandas builds on the time-related modules of NumPy and of the standard datetime library, and provides six time-related classes: Timestamp, Period, Timedelta, DatetimeIndex, PeriodIndex and TimedeltaIndex.
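To make the six classes a little more concrete, here is a minimal sketch that builds one object of each kind. The dates and durations are made up for illustration; they do not come from the article's data set.

```python
import pandas as pd

# A single point in time
ts = pd.Timestamp("2016-08-01 11:11:46")
# A span of time (here: the whole month of August 2016)
per = pd.Period("2016-08", freq="M")
# A duration
delta = pd.Timedelta(days=1, hours=2)

# The corresponding index (array-like) types
dt_index = pd.DatetimeIndex(["2016-08-01", "2016-08-02"])
per_index = pd.PeriodIndex(["2016-08", "2016-09"], freq="M")
td_index = pd.TimedeltaIndex(["1 days", "2 days"])

# Print the class name of each object
for obj in (ts, per, delta, dt_index, per_index, td_index):
    print(type(obj).__name__)
```

The scalar classes (Timestamp, Period, Timedelta) describe one value each, while the three index classes hold many values of the matching scalar type.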

2. Timestamp class

  • Among them, Timestamp is the most basic and most commonly used time class. In most cases, time strings are converted to Timestamp, and pandas provides the to_datetime() function for this purpose.
  • Note that the range of times a Timestamp can represent is limited.
1) Check whether the time column is a string column or a datetime column
import pandas as pd

df = pd.read_csv(r"E:\Computer video recording software\Video download installation path\Python Data analysis and application people mail edition\data\meal_order_info.csv",
                engine="python",
                encoding="gbk")
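# Preview the raw column as read from the CSV file
df['lock_time'].head()
# Check the type of a single element: str means the column has not been parsed yet
type(df['lock_time'][0])
# After to_datetime() the column has a datetime64 dtype
pd.to_datetime(df['lock_time']).head()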
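
Since the CSV file is not available to every reader, the same check can be sketched on a small made-up DataFrame. The two lock_time values below are hypothetical stand-ins for the real column, and is_datetime64_any_dtype is just one convenient way to test the dtype.

```python
import pandas as pd

# Hypothetical stand-in for the lock_time column of the CSV file
df = pd.DataFrame({"lock_time": ["2016-08-01 11:11:46", "2016-08-01 11:31:55"]})

# Values read from a text file are plain Python strings
is_str = isinstance(df["lock_time"][0], str)
before = pd.api.types.is_datetime64_any_dtype(df["lock_time"])

# After conversion the column has a datetime64 dtype
converted = pd.to_datetime(df["lock_time"])
after = pd.api.types.is_datetime64_any_dtype(converted)

print(is_str, before, after)  # True False True
```

Checking the dtype of the whole column is usually more robust than inspecting only the first element, because it does not depend on what that one value happens to be.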

2) Use pd.to_datetime() to convert a string to datetime format
import pandas as pd

df = pd.read_csv(r"E:\Computer video recording software\Video download installation path\Python Data analysis and application people mail edition\data\meal_order_info.csv",
                engine="python",
                encoding="gbk")
df['lock_time'] = pd.to_datetime(df['lock_time'])
df['lock_time'].head()
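
Real data often contains values that cannot be parsed. The sketch below, on made-up strings, shows two optional parameters of to_datetime() that help in that case: format states the expected layout explicitly, and errors="coerce" turns unparseable entries into NaT instead of raising an error. Whether the real lock_time column needs either of them is an assumption.

```python
import pandas as pd

# Hypothetical raw values; the last one is deliberately invalid
raw = pd.Series(["2016-08-01 11:11:46", "2016-08-02 09:30:00", "not a date"])

# format= describes the expected layout,
# errors="coerce" replaces unparseable entries with NaT
parsed = pd.to_datetime(raw, format="%Y-%m-%d %H:%M:%S", errors="coerce")

print(parsed)
print(parsed.isna().sum())  # number of values that could not be parsed
```

Counting the NaT values after conversion is a quick way to see how much of a column failed to parse.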

3) The Timestamp class can only represent times from 1677 to 2262
pd.Timestamp.min
pd.Timestamp.max
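
The limit comes from the way a nanosecond-resolution Timestamp is stored: as a signed 64-bit count of nanoseconds since 1970. The following sketch checks the bounds and shows one possible workaround, a Period, for dates outside that range. The year 1500 is an arbitrary example.

```python
import pandas as pd

print(pd.Timestamp.min)  # earliest representable nanosecond-resolution Timestamp
print(pd.Timestamp.max)  # latest representable nanosecond-resolution Timestamp

# The upper bound is the largest signed 64-bit integer, counted in nanoseconds
is_int64_max = pd.Timestamp.max.value == 2**63 - 1
print(is_int64_max)

# A Period is not subject to this limit, so it can describe a day in the year 1500
old_day = pd.Period("1500-01-01", freq="D")
print(old_day.year)
```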

4) Common properties of the Timestamp class
  • When processing and analyzing time data, you often need to extract the year, month or other components of a time. The corresponding attributes of the Timestamp class provide this.
  • Combined with apply() or a Python list comprehension, these attributes can be extracted for a whole column of a DataFrame.

The operation is as follows:

import pandas as pd

df = pd.read_csv(r"E:\Computer video recording software\Video download installation path\Python Data analysis and application people mail edition\data\meal_order_info.csv",
                engine="python",
                encoding="gbk")
df['lock_time'] = pd.to_datetime(df['lock_time'])
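# Extract the year of every timestamp in the column
# (df['lock_time'].dt.year is an equivalent, vectorized form)
df["year"] = df['lock_time'].apply(lambda x: x.year)
df["year"].head()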
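
As an alternative to apply(), a datetime column exposes the same attributes through the .dt accessor. Here is a minimal sketch on two made-up timestamps; the column names other than lock_time are chosen for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the converted lock_time column
df = pd.DataFrame(
    {"lock_time": pd.to_datetime(["2016-08-01 11:11:46", "2016-09-15 20:05:00"])}
)

# Each attribute is computed for the whole column at once
df["year"] = df["lock_time"].dt.year
df["month"] = df["lock_time"].dt.month
df["day"] = df["lock_time"].dt.day
df["hour"] = df["lock_time"].dt.hour
df["weekday"] = df["lock_time"].dt.day_name()

print(df)
```

The .dt form avoids calling a Python function once per row, which tends to matter on large columns.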

5) Extract the date in the specified format by using the strftime() method

df['lock_time'][0]
df['lock_time'][0].strftime("%Y-%m")
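
The same formatting can be applied to a whole column through .dt.strftime(). This is a small sketch with made-up timestamps; note that the result is a column of strings, not of datetimes.

```python
import pandas as pd

# Hypothetical timestamps standing in for the lock_time column
times = pd.Series(pd.to_datetime(["2016-08-01 11:11:46", "2016-09-15 20:05:00"]))

# Format a single Timestamp
print(times[0].strftime("%Y-%m"))

# Format every element of the column; the result holds strings
year_month = times.dt.strftime("%Y-%m")
print(year_month.tolist())
```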

3. DatetimeIndex and PeriodIndex: similar to the to_datetime() function

  • Besides converting a column of the original DataFrame directly to Timestamp format, you can also take the data out separately and convert it to a DatetimeIndex or a PeriodIndex.
  • When converting to a PeriodIndex, the time interval must be specified through the freq parameter. Common intervals are Y for year, M for month, D for day, H for hour, T for minute and S for second. Recent pandas releases (2.2 and later) deprecate the uppercase H, T and S in favor of h, min and s. Both classes can be used to convert existing data and to create new time series data, and their parameters are very similar.
import pandas as pd

df = pd.read_csv(r"E:\Computer video recording software\Video download installation path\Python Data analysis and application people mail edition\data\meal_order_info.csv",
                engine="python",
                encoding="gbk")
df['lock_time'] = pd.DatetimeIndex(df['lock_time'])
df['lock_time'][0]
---------------------------------------------------------------
df = pd.read_csv(r"E:\Computer video recording software\Video download installation path\Python Data analysis and application people mail edition\data\meal_order_info.csv",
                engine="python",
                encoding="gbk")
df['lock_time'] = pd.PeriodIndex(df['lock_time'],freq="S")
df['lock_time'][0]
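
To see what the freq parameter actually changes, the sketch below converts the same two made-up strings with a monthly and a daily frequency. The values are hypothetical and only serve to show how the resulting Period is truncated to the chosen interval.

```python
import pandas as pd

# Hypothetical raw time strings
raw = ["2016-08-01 11:11:46", "2016-09-15 20:05:00"]

dt_index = pd.DatetimeIndex(raw)               # exact points in time
month_periods = pd.PeriodIndex(raw, freq="M")  # the month each time falls in
day_periods = pd.PeriodIndex(raw, freq="D")    # the day each time falls in

print(dt_index[0])
print(month_periods[0])
print(day_periods[0])

# A Period knows where its interval starts and ends
print(month_periods[0].start_time, month_periods[0].end_time)
```

A DatetimeIndex keeps the full timestamp, while a PeriodIndex keeps only the interval, which is often what you want when grouping by month or day.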

4. Timedelta class

  • Timedelta is the odd one out among the time-related classes: it represents a duration rather than a point in time. The duration can be positive or negative, for example 1 second, 2 minutes or 3 hours. Combining Timedelta with the other time-related classes makes arithmetic on times easy. Note that Timedelta does not support years or months as units; the largest units are weeks and days.
1) Move the date forward and backward by one day
import pandas as pd

df = pd.read_csv(r"E:\Computer video recording software\Video download installation path\Python Data analysis and application people mail edition\data\meal_order_info.csv",
                engine="python",
                encoding="gbk")
df['lock_time'] = pd.to_datetime(df['lock_time'])
df['lock_time'][0]
# Shift one day later
df['lock_time'][0] + pd.Timedelta(days=1)
# Shift one day earlier
df['lock_time'][0] + pd.Timedelta(days=-1)
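
The shift also works on a whole column at once. Because Timedelta has no month or year unit, the sketch additionally uses pd.DateOffset, which is calendar-aware, as one possible way to move by a month. The timestamps are made up for illustration.

```python
import pandas as pd

# Hypothetical timestamps standing in for the lock_time column
times = pd.Series(pd.to_datetime(["2016-01-31 11:11:46", "2016-08-01 20:05:00"]))

# Shift every value one day later / one day earlier
next_day = times + pd.Timedelta(days=1)
previous_day = times - pd.Timedelta(days=1)
print(next_day.tolist())
print(previous_day.tolist())

# Timedelta has no "months" unit; DateOffset handles calendar months
# and clips to the last valid day of the target month
next_month = times[0] + pd.DateOffset(months=1)
print(next_month)  # 2016-02-29 11:11:46 (2016 is a leap year)
```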

2) Difference between two times
  • With Timedelta you can easily add a period of time to a given time or subtract it from that time. Besides shifting times, you can also subtract two times directly to obtain a Timedelta.
df['lock_time'][0]
pd.to_datetime("2020-3-13") - df['lock_time'][0]

a = pd.to_datetime("2020-3-13") - df['lock_time'][0]
a.days
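
Subtraction works on whole columns as well, and the resulting durations can be read out through .dt. A minimal sketch with made-up timestamps and an arbitrary reference date:

```python
import pandas as pd

# Hypothetical timestamps and reference date
times = pd.Series(pd.to_datetime(["2016-08-01 11:11:46", "2016-08-03 23:11:46"]))
reference = pd.Timestamp("2016-08-10")

# Subtracting a column from a Timestamp gives a column of Timedelta values
elapsed = reference - times
print(elapsed.dt.days.tolist())  # whole days only: [8, 6]

# Difference between two individual timestamps
gap = times[1] - times[0]
print(gap.days)             # 2
print(gap.total_seconds())  # 216000.0
```

The days attribute drops the fractional part of a day, so total_seconds() is the safer choice when the exact length of the interval matters.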

To sum up: whichever of the conversion methods above you use, once a string has been converted to a datetime format you can use the common attributes of the Timestamp class, such as year and month, to extract the components you need.

Added by davidprogramer on Thu, 25 Nov 2021 05:18:44 +0200