Getting the date from a pandas datetime

Time: 2021-12-07 22:55:26

I have a 'Date' column in my data frame where the dates are formatted like this:

0    1998-08-26 04:00:00 

If I only want the year, month and day, how do I drop the unneeded hour?

2 solutions

#1


31  

The quickest way is to use DatetimeIndex's normalize (you first need to make the column a DatetimeIndex):

In [11]: df = pd.DataFrame({"t": pd.date_range('2014-01-01', periods=5, freq='H')})

In [12]: df
Out[12]:
                    t
0 2014-01-01 00:00:00
1 2014-01-01 01:00:00
2 2014-01-01 02:00:00
3 2014-01-01 03:00:00
4 2014-01-01 04:00:00

In [13]: pd.DatetimeIndex(df.t).normalize()
Out[13]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2014-01-01, ..., 2014-01-01]
Length: 5, Freq: None, Timezone: None

In [14]: df['date'] = pd.DatetimeIndex(df.t).normalize()

In [15]: df
Out[15]:
                    t       date
0 2014-01-01 00:00:00 2014-01-01
1 2014-01-01 01:00:00 2014-01-01
2 2014-01-01 02:00:00 2014-01-01
3 2014-01-01 03:00:00 2014-01-01
4 2014-01-01 04:00:00 2014-01-01

DatetimeIndex also has some other useful attributes, e.g. .year, .month, .day.
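
As a minimal sketch of those attributes (the sample timestamps and the names `idx`, `years`, `months` and `days` are made up for illustration), each one returns an integer Index aligned with the original rows:

```python
import pandas as pd

# Hypothetical sample data: three timestamps with a time-of-day part.
df = pd.DataFrame({
    "t": pd.to_datetime([
        "2014-01-01 00:00:00",
        "2014-01-01 01:00:00",
        "1998-08-26 04:00:00",
    ])
})

# Wrap the column in a DatetimeIndex to reach the component attributes.
idx = pd.DatetimeIndex(df["t"])

years = idx.year    # calendar year of each timestamp
months = idx.month  # month number, 1 to 12
days = idx.day      # day of the month

print(list(years), list(months), list(days))
```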


From pandas 0.15 onward, a Series has a dt accessor, so you can call this (and the other datetime methods) directly on the column:

df.t.dt.normalize()
# equivalent to
pd.DatetimeIndex(df.t).normalize()
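
A short sketch applying the accessor to a column shaped like the one in the question. The sample values and the new column names (`day`, `date_only`, `text`) are assumptions for illustration. `dt.normalize()` keeps the datetime64 dtype and resets the time to midnight. If you want the time gone entirely, `dt.date` and `dt.strftime` are two alternatives, at the cost of leaving the datetime64 dtype:

```python
import pandas as pd

# Hypothetical sample data in the same format as the question's 'Date' column.
df = pd.DataFrame({
    "Date": pd.to_datetime(["1998-08-26 04:00:00", "1998-08-27 15:30:00"])
})

# Stays datetime64; the time component becomes 00:00:00.
df["day"] = df["Date"].dt.normalize()

# Plain datetime.date objects, stored in an object-dtype column.
df["date_only"] = df["Date"].dt.date

# Formatted strings such as '1998-08-26'.
df["text"] = df["Date"].dt.strftime("%Y-%m-%d")

print(df)
```

Keeping the normalized datetime64 column is usually the more convenient choice if you still need date arithmetic or resampling afterwards.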

#2


0  

Another possibility is to use str.split:

df['Date'] = df['Date'].str.split(' ',expand=True)[0]

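This splits the 'Date' column into two columns labeled 0 and 1, using the whitespace between the date and the time as the separator. Note that it only works when 'Date' holds strings; the .str accessor is not available on a datetime64 column.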

Column 0 of the returned DataFrame holds the date and column 1 holds the time. The assignment then overwrites the 'Date' column of your original DataFrame with column 0, which is just the date (as a string).
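
A minimal, self-contained sketch of the split, assuming the 'Date' column holds strings (the sample values and the intermediate name `parts` are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data: 'Date' stored as text, not as datetime64.
df = pd.DataFrame({"Date": ["1998-08-26 04:00:00", "1998-08-27 15:30:00"]})

# expand=True returns a DataFrame with one column per piece: 0 (date) and 1 (time).
parts = df["Date"].str.split(" ", expand=True)

# Keep only the date piece; the result is still a string column.
df["Date"] = parts[0]

print(df)
```

Because the result is text rather than a datetime, date arithmetic on it requires converting it back, for example with `pd.to_datetime`.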
