For example, I have:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
df[df['a'] == 3].a = 4
This does not assign 4 to the row where a is 3.
df[df['a']==3] = 4
But this works.
I am confused about how the assignment works. I would appreciate any references or an explanation.
5 Solutions
#1
3
You do not want to use the second method. It selects the matching rows as a sub-DataFrame and assigns the same value to every column of those rows.
For example,
df
   a  b
0  1  4
1  2  3
2  3  6
df[df['a'] == 3]
   a  b
2  3  6
df[df['a'] == 3] = 3
df
   a  b
0  1  4
1  2  3
2  3  3
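To make the effect easier to see, here is a small sketch that contrasts assigning to the whole boolean selection with assigning to a single column through .loc. The frame contents and the names df_rows and df_cell are made up for illustration.

```python
import pandas as pd

# Two identical frames so the two assignment styles can be compared.
df_rows = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 3, 6]})
df_cell = df_rows.copy()

# Assigning to the boolean selection overwrites every column of the matching rows.
df_rows[df_rows['a'] == 3] = 4

# Assigning through .loc with a column label touches only column 'a'.
df_cell.loc[df_cell['a'] == 3, 'a'] = 4

print(df_rows)  # column 'b' in the matching row was overwritten as well
print(df_cell)  # column 'b' is untouched
```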
The first method does not work because boolean indexing returns a copy of the selected rows, not a view of df. The assignment to .a lands on that temporary copy, so df is left unchanged and pandas warns:
df[df['a'] == 3].a = 4
/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/core/generic.py:3110: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self[name] = value
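A minimal sketch of why nothing changes, splitting the one-liner into two steps. The name subset is illustrative, and the warning is silenced only to keep the output clean. Whether a warning is emitted at all can differ between pandas versions.

```python
import warnings
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})

# Step 1: boolean indexing builds a new DataFrame holding copies of the matching rows.
subset = df[df['a'] == 3]

# Step 2: the assignment targets that new object, not df.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    subset['a'] = 4

print(subset['a'].tolist())  # the copy was modified
print(df['a'].tolist())      # the original is unchanged
```

In the one-line form the copy is never bound to a name, so the modified object is discarded immediately.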
So your options are .loc (label-based access) or .iloc (position-based access) indexing:
df.loc[df.a == 3, 'a'] = 4
df
a
0 1
1 2
2 4
If you are passing a boolean Series as the mask, you cannot hand it to .iloc directly; either use .loc or convert the mask to a NumPy array first.
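If positional indexing is still wanted, one workaround (a sketch, not part of the original answer) is to turn the boolean Series into a plain NumPy array, which .iloc does accept. The column position 0 below assumes 'a' is the first column of this made-up frame.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 3, 6]})

# Boolean Series aligned on the index; .iloc rejects this form.
mask = df['a'] == 3

# A NumPy boolean array carries no index, so .iloc accepts it.
# Column 0 is 'a' in this frame.
df.iloc[mask.to_numpy(), 0] = 4

print(df)
```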
#2
2
Use .loc with a boolean index and a column label:
df.loc[df.a == 3,'a'] = 4
print(df)
Output:
a
0 1
1 2
2 4
In your method, you are slicing the DataFrame, so pandas creates a copy. The assignment then happens on that copy rather than on the original DataFrame.
#4
1
Alternatively, you can use replace:
df['a'] = df['a'].replace(3, 4)
(modified, thanks @COLDSPEED)
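Note that replace returns a new Series rather than modifying the column in place, which is why the result is assigned back. A quick sketch, with before and replaced as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})

# replace builds a new Series; df is not modified by this call.
replaced = df['a'].replace(3, 4)
before = df['a'].tolist()

# Assigning the result back is what actually updates the column.
df['a'] = replaced

print(before)             # values before the assignment
print(df['a'].tolist())   # values after the assignment
```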
#5
0
You could also use apply:
df['a'].apply(lambda x: 4 if x == 3 else x)
This returns a new Series (assign it back to df['a'] to keep the change):
0 1
1 2
2 4
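As a quick check, the sketch below assigns the apply result back to the column and compares it with the .loc approach. The names via_apply and via_loc are illustrative. On large frames the element-wise lambda is typically slower than the vectorised .loc or replace forms.

```python
import pandas as pd

via_apply = pd.DataFrame({'a': [1, 2, 3]})
via_loc = via_apply.copy()

# apply returns a new Series, so it must be assigned back to take effect.
via_apply['a'] = via_apply['a'].apply(lambda x: 4 if x == 3 else x)

# The .loc form updates the matching cells in place.
via_loc.loc[via_loc['a'] == 3, 'a'] = 4

# Both routes should end with the same frame.
print(via_apply.equals(via_loc))
```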