My pandas dataframe:
dframe = pd.DataFrame({"A":list("abcde"), "B":list("aabbc"), "C":[1,2,3,4,5]}, index=[10,11,12,13,14])
    A  B  C
10  a  a  1
11  b  a  2
12  c  b  3
13  d  b  4
14  e  c  5
My desired output:
    A  B  C     a     b     c
10  a  a  1     1  None  None
11  b  a  2     2  None  None
12  c  b  3  None     3  None
13  d  b  4  None     4  None
14  e  c  5  None  None     5
The idea is to create a new column for each distinct value in column 'B', then copy the corresponding value from column 'C' into the matching new column. Here is my code:
lis = sorted(list(dframe.B.unique()))
# creating empty columns
for items in lis:
    dframe[items] = None
# here copy and pasting
for items in range(0, len(dframe)):
    slot = dframe.B.iloc[items]
    dframe[slot][items] = dframe.C.iloc[items]
I ended up with this warning:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
app.launch_new_instance()
This code worked well in Python 2.7 but not in 3.x. Where am I going wrong?
2 Answers
#1
1
Start with
to_be_appended = pd.get_dummies(dframe.B).replace(0, np.nan).mul(dframe.C, axis=0)
Then concat
dframe = pd.concat([dframe, to_be_appended], axis=1)
Looks like:
print(dframe)
    A  B  C    a    b    c
10  a  a  1  1.0  NaN  NaN
11  b  a  2  2.0  NaN  NaN
12  c  b  3  NaN  3.0  NaN
13  d  b  4  NaN  4.0  NaN
14  e  c  5  NaN  NaN  5.0
Notes for searching: this combines one-hot encoding with a broadcast multiplication.
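For readers who want to run this end to end, here is a minimal sketch of the same one-hot-then-multiply idea. It assumes a reasonably recent pandas; passing dtype=float to get_dummies is an addition not in the answer above, made because newer pandas versions may return boolean dummies, and replacing 0 with NaN is easier to reason about on floats.

```python
import numpy as np
import pandas as pd

dframe = pd.DataFrame(
    {"A": list("abcde"), "B": list("aabbc"), "C": [1, 2, 3, 4, 5]},
    index=[10, 11, 12, 13, 14],
)

# One-hot encode B: one column per distinct value, 1.0 where the row matches.
# dtype=float is an assumption added here, not part of the original answer.
dummies = pd.get_dummies(dframe["B"], dtype=float)

# Turn the 0.0 entries into NaN, then multiply each row by its value in C.
# axis=0 aligns the Series C with the rows (index) of the dummy frame.
to_be_appended = dummies.replace(0, np.nan).mul(dframe["C"], axis=0)

# Append the new columns next to the original ones.
result = pd.concat([dframe, to_be_appended], axis=1)
print(result)
```

The new columns hold floats with NaN for the empty cells rather than None, because NaN is what pandas uses for missing numeric data.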
#2
0
Chained assignment will now by default warn if the user is assigning to a copy.
This can be changed with the option mode.chained_assignment; the allowed values are 'raise', 'warn', and None. See the docs.
In [5]: dfc = pd.DataFrame({'A':['aaa','bbb','ccc'],'B':[1,2,3]})
In [6]: pd.set_option('chained_assignment','warn')
The following warning / exception will show if this is attempted.
In [7]: dfc.loc[0]['A'] = 1111
Traceback (most recent call last) ...
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
Here is the correct method of assignment.
In [8]: dfc.loc[0,'A'] = 11
In [9]: dfc
     A  B
0   11  1
1  bbb  2
2  ccc  3
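Applied to the loop in the question, the same rule means replacing the chained dframe[slot][items] = ... with a single .loc[row_label, column_label] assignment. The following is a sketch of that change, not a claim about what the asker ended up using; it assumes the index labels are unique, and the variable names row_label and target are introduced here for illustration.

```python
import pandas as pd

dframe = pd.DataFrame(
    {"A": list("abcde"), "B": list("aabbc"), "C": [1, 2, 3, 4, 5]},
    index=[10, 11, 12, 13, 14],
)

# Create one empty column per distinct value in B.
for name in sorted(dframe["B"].unique()):
    dframe[name] = None

# Write each value with one .loc call so the assignment goes to dframe itself,
# not to an intermediate object returned by dframe[slot].
# Assumes unique index labels, so .loc[row_label, col] returns a scalar.
for row_label in dframe.index:
    target = dframe.loc[row_label, "B"]  # name of the column to fill
    dframe.loc[row_label, target] = dframe.loc[row_label, "C"]

print(dframe)
```

Note that the original loop also mixed positions and labels: items was a position (0 to 4), while dframe[slot][items] looks items up as an index label (10 to 14). Iterating over dframe.index avoids that mismatch.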