My dataframe df1:
date, country, category, score, value
2017-01-01, US, 123, 555, 232.02
2017-01-01, US, 223, 10, 22.02
I have a lookup dataframe df2:
category, factor_score_0_100, factor_score_101_500, factor_score_501_1000
123, 2.0, 3.0, 4.0
223, 5.4, 4.3, 3.2
Based on the category and score of a row in df1, I need to get the factor_score from df2. If the score in df1 for a particular category is between 0 and 100, I need to return factor_score_0_100 for that category, and so on.
So far I've been able to convert df2 into a dictionary of the form:
category: [factor_score_0_100, factor_score_101_500, factor_score_501_1000]
I was attempting to write a function and then apply it via a lambda, but I'm not sure how to use two columns as input.
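For reference, one way to feed two columns into a function is a row-wise apply with axis=1. This is only a minimal sketch: it assumes the column names have no leading spaces, that the three score bands end at 100, 500 and 1000, and pick_factor and UPPER_BOUNDS are hypothetical names made up for illustration.

```python
import pandas as pd

# Sample data from the question (assumption: column names without leading spaces).
df1 = pd.DataFrame({
    "date": ["2017-01-01", "2017-01-01"],
    "country": ["US", "US"],
    "category": [123, 223],
    "score": [555, 10],
    "value": [232.02, 22.02],
})
df2 = pd.DataFrame({
    "category": [123, 223],
    "factor_score_0_100": [2.0, 5.4],
    "factor_score_101_500": [3.0, 4.3],
    "factor_score_501_1000": [4.0, 3.2],
})

# category -> [factor_score_0_100, factor_score_101_500, factor_score_501_1000]
factors = df2.set_index("category").T.to_dict("list")

# Upper bound of each score band, in the same order as the lists above.
UPPER_BOUNDS = [100, 500, 1000]


def pick_factor(row):
    """Hypothetical helper: factor for this row's category and score."""
    for position, upper in enumerate(UPPER_BOUNDS):
        if row["score"] <= upper:
            return factors[row["category"]][position]
    return float("nan")  # score lies above the last band


# axis=1 hands each row to the function as a Series,
# so both "category" and "score" are available inside it.
df1["factor"] = df1.apply(pick_factor, axis=1)
```

A row-wise apply is easy to read but runs a Python function per row, so it can be slow on large frames.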
How can I proceed here? TIA
1 solution
#1
This is a bit of a hack using IntervalIndex + lookup. Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in 2.0, so the code below needs an older pandas version.
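import pandas as pd
# Index df2 by category so that rows can be selected by category value
df2 = df2.set_index('category')
# Split names such as 'factor_score_0_100' into levels ('factor', 'score', '0', '100')
df2.columns = df2.columns.str.split('_', expand=True)
# Levels 2 and 3 hold the band limits; build closed intervals [low, high] from them
idx = pd.IntervalIndex.from_arrays(df2.columns.get_level_values(2).astype(int), df2.columns.get_level_values(3).astype(int), closed='both')
df2.columns = idx
# The df1 column names keep their leading space (' category', ' score')
df2.lookup(df1[' category'], df1[' score'])
Out[171]: array([4. , 5.4])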
Then assign the result back to df1:
df1['NEW'] = df2.lookup(df1[' category'], df1[' score'])
df1
Out[173]:
         date country  category  score   value  NEW
0  2017-01-01      US       123    555  232.02  4.0
1  2017-01-01      US       223     10   22.02  5.4
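On pandas 2.0 and later, where DataFrame.lookup is no longer available, the same idea can be sketched with get_indexer and NumPy indexing. This is not the code from the answer above, just a rough equivalent; it assumes column names without leading spaces, and the names lookup, bands, row_pos and col_pos are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question (assumption: column names without leading spaces).
df1 = pd.DataFrame({
    "date": ["2017-01-01", "2017-01-01"],
    "country": ["US", "US"],
    "category": [123, 223],
    "score": [555, 10],
    "value": [232.02, 22.02],
})
df2 = pd.DataFrame({
    "category": [123, 223],
    "factor_score_0_100": [2.0, 5.4],
    "factor_score_101_500": [3.0, 4.3],
    "factor_score_501_1000": [4.0, 3.2],
})

lookup = df2.set_index("category")

# Parse the band limits out of names such as "factor_score_101_500".
parts = lookup.columns.str.split("_")
lows = [int(p[-2]) for p in parts]
highs = [int(p[-1]) for p in parts]
bands = pd.IntervalIndex.from_arrays(lows, highs, closed="both")

# Positional row (category) and column (score band) for each row of df1.
row_pos = lookup.index.get_indexer(df1["category"])
col_pos = bands.get_indexer(df1["score"])

# get_indexer returns -1 where nothing matches; leave those rows as NaN.
found = (row_pos >= 0) & (col_pos >= 0)
result = np.full(len(df1), np.nan)
result[found] = lookup.to_numpy()[row_pos[found], col_pos[found]]
df1["NEW"] = result
```

Scores that fall outside every band (for example a fractional score between 100 and 101) and categories missing from df2 end up as NaN rather than raising an error.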