How do I filter a dataframe of dates by a specific month/day?

Time: 2022-09-15 15:52:07

So my code is as follows:

df['Dates'][df['Dates'].index.month == 11]

I was testing whether I could filter by month so that only November dates are shown, but this did not work. It gives the following error: AttributeError: 'Int64Index' object has no attribute 'month'.

If I do

print type(df['Dates'][0])

then I get class 'pandas.tslib.Timestamp', which leads me to believe that the objects stored in the dataframe are Timestamp objects. (I'm not sure where the 'Int64Index' in the earlier error is coming from.)

What I want to do is this: the dataframe column contains dates from the early 2000s to the present in the following format: dd/mm/yyyy. I want to keep only the dates between November 15 and March 15, regardless of the year. What is the easiest way to do this?

Thanks.

Here is df['Dates'] (with indices):

0    2006-01-01
1    2006-01-02
2    2006-01-03
3    2006-01-04
4    2006-01-05
5    2006-01-06
6    2006-01-07
7    2006-01-08
8    2006-01-09
9    2006-01-10
10   2006-01-11
11   2006-01-12
12   2006-01-13
13   2006-01-14
14   2006-01-15
...

2 Answers

#1


12  

Map an anonymous function that extracts the month onto the series and compare the result to 11 (November). That gives you a boolean mask, which you can then use to filter your dataframe.

nov_mask = df['Dates'].map(lambda x: x.month) == 11
df[nov_mask]
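
For a datetime64 column, the same mask can probably be built without a lambda through the `.dt` accessor. This is a minimal sketch, assuming `df['Dates']` has already been parsed into datetimes; the sample frame and the `nov_rows` name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's frame.
df = pd.DataFrame({
    "Dates": pd.to_datetime(["2006-01-01", "2006-11-15", "2007-11-30", "2008-03-01"])
})

# .dt exposes datetime components of a Series element-wise,
# so no per-row lambda is needed.
nov_mask = df["Dates"].dt.month == 11
nov_rows = df[nov_mask]
print(nov_rows)
```

If the column still holds strings, `pd.to_datetime` would need to be applied first.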

I don't think there is a straightforward way to filter the way you want while ignoring the year, so try this.

import pandas as pd

# Reference window spanning a leap year, so that 02-29 is included
nov_mar_series = pd.Series(pd.date_range("2015-11-15", "2016-03-15"))
# Build "month-day" strings without the year
nov_mar_no_year = nov_mar_series.map(lambda x: x.strftime("%m-%d"))
# Add the same yearless string to the dataframe (the column is 'Dates', not 'Date')
df["no_year"] = df['Dates'].map(lambda x: x.strftime("%m-%d"))
no_year_mask = df['no_year'].isin(nov_mar_no_year)
df[no_year_mask]
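
An alternative to matching on month-day strings is to compare the month and day numbers directly. The sketch below assumes a datetime64 `Dates` column. The sample data and the names `in_window` and `winter` are hypothetical. Because the window wraps around New Year, it is written as the union of three pieces.

```python
import pandas as pd

# Hypothetical sample data; the real frame is assumed to have a datetime64 'Dates' column.
df = pd.DataFrame({"Dates": pd.to_datetime([
    "2006-11-14", "2006-11-15", "2007-01-10", "2008-02-29",
    "2008-03-15", "2008-03-16", "2009-07-04",
])})

month = df["Dates"].dt.month
day = df["Dates"].dt.day

# Nov 15-30, all of Dec/Jan/Feb, and Mar 1-15 (both endpoints inclusive).
in_window = (
    ((month == 11) & (day >= 15))
    | month.isin([12, 1, 2])
    | ((month == 3) & (day <= 15))
)
winter = df[in_window]
print(winter)
```

This avoids adding a helper column and handles February 29 without needing a reference date range.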

#2


0  

There are two issues in your code. First, the column reference should come after the filtering condition. Second, ".month" can be read from either the index or the column, but not from both at once: on the index it only works when the index is a DatetimeIndex (yours is an Int64Index, hence the error), and on a column it has to go through the ".dt" accessor. One of the following should work:

df[df.index.month == 11]['Dates']  # requires a DatetimeIndex

df[df['Dates'].dt.month == 11]['Dates']
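
The index-based form only applies once the dates are actually in the index. With the default integer index shown in the question, `df.index.month` raises the AttributeError the asker reported. A minimal sketch of one way to get there, assuming a datetime64 `Dates` column; the sample data and the `indexed` and `nov_dates` names are illustrative only.

```python
import pandas as pd

# Hypothetical sample data with the default integer index, as in the question.
df = pd.DataFrame({
    "Dates": pd.to_datetime(["2006-01-01", "2006-11-15", "2007-11-30"])
})

# Move the dates into the index; drop=False keeps the 'Dates' column as well.
indexed = df.set_index("Dates", drop=False)

# A DatetimeIndex has a .month attribute, so the index-based filter now works.
nov_dates = indexed[indexed.index.month == 11]["Dates"]
print(nov_dates)
```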
