Assign values in a pandas column based on another column

Time: 2022-01-05 13:08:10

I have two DataFrames like the ones shown below:

A

Timestamp C1 C2 C3
1 0 0 0 
2 0 0 0
3 0 0 0
4 0 0 0
5 0 0 0
6 0 0 0
7 0 0 0

and B

Timestamp C1 C2 C3
2 0 0 0
3 v1 v2 v3
4 v4 v5 v6
7 0 0 0

I want to merge the two datasets and replace the zeros in A with the values in B, matching rows on the Timestamp column, so that the new A looks like this:

Timestamp C1 C2 C3
1 0 0 0 
2 0 0 0
3 v1 v2 v3
4 v4 v5 v6
5 0 0 0
6 0 0 0
7 0 0 0

1 Solution

#1


I think you need mask to replace the zeros in B with NaN, and then combine_first to fill those gaps from A:

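# Use Timestamp as the index of both frames if it is still a regular column
# A = A.set_index('Timestamp')
# B = B.set_index('Timestamp')

# Turn the zeros in B into NaN, then fill those gaps (and rows missing from B) from A
df = B.mask(B == 0).combine_first(A)
# Alternative (requires: import numpy as np)
# df = B.replace({0: np.nan}).combine_first(A)
print(df)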
           C1  C2  C3
Timestamp            
1           0   0   0
2           0   0   0
3          v1  v2  v3
4          v4  v5  v6
5           0   0   0
6           0   0   0
7           0   0   0
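
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same mask + combine_first approach. The sample frames are rebuilt from the tables in the question, and the names A_idx, B_idx, result and result_flat are only illustrative. It assumes Timestamp starts out as a regular column in both frames.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question (illustrative only).
A = pd.DataFrame({"Timestamp": range(1, 8), "C1": 0, "C2": 0, "C3": 0})
B = pd.DataFrame(
    {
        "Timestamp": [2, 3, 4, 7],
        "C1": [0, "v1", "v4", 0],
        "C2": [0, "v2", "v5", 0],
        "C3": [0, "v3", "v6", 0],
    }
)

# Index both frames by Timestamp so rows are matched by label, not position.
A_idx = A.set_index("Timestamp")
B_idx = B.set_index("Timestamp")

# Zeros in B become NaN; combine_first then fills every NaN (and every
# Timestamp that is missing from B) with the corresponding value from A.
result = B_idx.mask(B_idx == 0).combine_first(A_idx)

# Optionally turn Timestamp back into a regular column.
result_flat = result.reset_index()
print(result_flat)
```

One caveat with this approach: every zero in B is treated as "missing", so a zero in B that is meant as a real value would also be overwritten by whatever A holds at that position.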
