This is rather similar to this question, but with one key difference: I'm selecting the data I want to change not by its index but by some criteria.
If the criteria I apply return a single row, I'd expect to be able to set the value of a certain column in that row in an easy way, but my first attempt doesn't work:
>>> d = pd.DataFrame({'year':[2008,2008,2008,2008,2009,2009,2009,2009],
... 'flavour':['strawberry','strawberry','banana','banana',
... 'strawberry','strawberry','banana','banana'],
... 'day':['sat','sun','sat','sun','sat','sun','sat','sun'],
... 'sales':[10,12,22,23,11,13,23,24]})
>>> d
day flavour sales year
0 sat strawberry 10 2008
1 sun strawberry 12 2008
2 sat banana 22 2008
3 sun banana 23 2008
4 sat strawberry 11 2009
5 sun strawberry 13 2009
6 sat banana 23 2009
7 sun banana 24 2009
>>> d[d.sales==24]
day flavour sales year
7 sun banana 24 2009
>>> d[d.sales==24].sales = 100
>>> d
day flavour sales year
0 sat strawberry 10 2008
1 sun strawberry 12 2008
2 sat banana 22 2008
3 sun banana 23 2008
4 sat strawberry 11 2009
5 sun strawberry 13 2009
6 sat banana 23 2009
7 sun banana 24 2009
So rather than setting 2009 Sunday's banana sales to 100, nothing happens! What's the nicest way to do this? Ideally the solution should not use the row number, as you normally don't know that in advance!
Many thanks in advance, Rob
2 solutions
#1
46
There are many ways to do that:
1
In [7]: d.sales[d.sales==24] = 100
In [8]: d
Out[8]:
day flavour sales year
0 sat strawberry 10 2008
1 sun strawberry 12 2008
2 sat banana 22 2008
3 sun banana 23 2008
4 sat strawberry 11 2009
5 sun strawberry 13 2009
6 sat banana 23 2009
7 sun banana 100 2009
2
In [26]: d.loc[d.sales == 12, 'sales'] = 99
In [27]: d
Out[27]:
day flavour sales year
0 sat strawberry 10 2008
1 sun strawberry 99 2008
2 sat banana 22 2008
3 sun banana 23 2008
4 sat strawberry 11 2009
5 sun strawberry 13 2009
6 sat banana 23 2009
7 sun banana 100 2009
3
In [28]: d.sales = d.sales.replace(23, 24)
In [29]: d
Out[29]:
day flavour sales year
0 sat strawberry 10 2008
1 sun strawberry 99 2008
2 sat banana 22 2008
3 sun banana 24 2008
4 sat strawberry 11 2009
5 sun strawberry 13 2009
6 sat banana 24 2009
7 sun banana 100 2009
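To tie this back to the question, here is a minimal sketch of why the first attempt did nothing and why method 2 works. It rebuilds the same `d` from the question; the names `subset`, `unchanged` and `updated` are mine, just for illustration. As far as I know, the chained form in method 1 can emit a `SettingWithCopyWarning` on some pandas versions and stops updating `d` once Copy-on-Write is enabled, so I'd lean on `.loc`.

```python
import pandas as pd

d = pd.DataFrame({
    'year': [2008, 2008, 2008, 2008, 2009, 2009, 2009, 2009],
    'flavour': ['strawberry', 'strawberry', 'banana', 'banana',
                'strawberry', 'strawberry', 'banana', 'banana'],
    'day': ['sat', 'sun', 'sat', 'sun', 'sat', 'sun', 'sat', 'sun'],
    'sales': [10, 12, 22, 23, 11, 13, 23, 24],
})

# Boolean-mask indexing builds a new DataFrame, so writing to it
# leaves the original frame untouched (the copy is made explicit here).
subset = d[d.sales == 24].copy()
subset['sales'] = 100
unchanged = d.loc[7, 'sales']   # value in the original frame after the write

# A single .loc call picks the rows by mask and the column by label,
# and writes into d itself.
d.loc[d.sales == 24, 'sales'] = 100
updated = d.loc[7, 'sales']
```

Note that nothing here needs the row number; the `7` is only used afterwards to inspect the result.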
#2
6
I'm not sure about older versions of pandas, but in 0.16 the value of a particular cell can be set based on multiple column values.
Extending the answer provided by @waitingkuo, the same operation can also be done based on values of multiple columns.
d.loc[(d.day== 'sun') & (d.flavour== 'banana') & (d.year== 2009),'sales'] = 100
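If you want to be sure the criteria really hit a single row before overwriting anything, one possible sketch is to build the mask first and count its `True` entries. The names `mask` and `matched` and the `ValueError` check are my own additions, not part of the answer above.

```python
import pandas as pd

d = pd.DataFrame({
    'year': [2008, 2008, 2008, 2008, 2009, 2009, 2009, 2009],
    'flavour': ['strawberry', 'strawberry', 'banana', 'banana',
                'strawberry', 'strawberry', 'banana', 'banana'],
    'day': ['sat', 'sun', 'sat', 'sun', 'sat', 'sun', 'sat', 'sun'],
    'sales': [10, 12, 22, 23, 11, 13, 23, 24],
})

# Each comparison needs its own parentheses because & binds tighter than ==.
mask = (d.day == 'sun') & (d.flavour == 'banana') & (d.year == 2009)

# Summing a boolean Series counts the rows that meet all three conditions.
matched = int(mask.sum())
if matched != 1:
    raise ValueError(f"expected exactly one matching row, found {matched}")

# Write through .loc so the original frame is modified in place.
d.loc[mask, 'sales'] = 100
```

Keeping the mask in a variable also makes it easy to reuse the same selection for reading the value back, e.g. `d.loc[mask, 'sales']`.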