R: populating a data table using an ifelse statement

Date: 2021-06-24 12:31:47

I am trying to perform a comparison on each row of a data table and then extract a row from another table based on the result. If the index to be looked up is greater than the length of the lookup table, a calculation needs to be performed instead. Here is what I have. The table I am iterating over is Indicators and looks like this:

Row, Val.A, Val.B
1,   30,    20.0
2,   3,     40.0
3,   1,     100.0
...

The table I am looking up rows from is Loading and looks like this:

Index, Zone.A, Zone.B, Zone.C, Zone.D, Zone.E
1,     10.0,   20.0,   1.00,   23.0,   34.5
2,     20.0,   40.0,   10.0,   34.5,   54.0
3,     40.0,   100.0,  100.0,  67.8,   98.2
...
10,    10.0,   10.0,   10.0,   10.0,   10.0 

I have been trying to use ifelse() or apply() for this, but it is not working. The goal is to look up the row in the Loading table that corresponds to the value of Val.A in the Indicators table, and to perform a calculation when there is no such row in Loading. The code I am trying to use is the following:

max.index <- max(Loading[,1])
result <- ifelse(Indicators$Val.A < max.index,
     Loading[[Indicators$Val.A,2:6]],
     Loading[[max.index,2:6]] * Indicators$Val.A
)

Using the data shown, the desired result for Indicators would be:

Zone.A, Zone.B, Zone.C, Zone.D, Zone.E
300.0,  300.0,  300.0,  300.0,  300.0
40.0,   100.0,  100.0,  67.8,   98.2
10.0,   20.0,   1.00,   23.0,   34.5

The first row lies outside the available rows in the Loading table, so it is calculated; the other rows of Indicators have values contained in the Loading table, so those rows are just looked up. Thanks for any help you can provide. R often confuses me with its iteration and vector operations.

1 solution

#1

This seems to work:

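# Left join: keep every row of Indicators; unmatched rows get NA in Zone.A..Zone.E
z <- merge(Indicators, Loading, by.x = "Val.A", by.y = "Index", all.x = TRUE)
# Fill the unmatched rows with the last row of Loading scaled by Val.A
z[is.na(z$Zone.A), 4:8] <- Loading[nrow(Loading), 2:6] * z[is.na(z$Zone.A), ]$Val.A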
z
#   Val.A Row Val.B Zone.A Zone.B Zone.C Zone.D Zone.E
# 1     1   3   100     10     20      1   23.0   34.5
# 2     3   2    40     40    100    100   67.8   98.2
# 3    30   1    20    300    300    300  300.0  300.0

The basic idea is to merge Loading into Indicators using Indicators$Val.A and Loading$Index, keeping all rows from Indicators. Absent a match, Zone.A through Zone.E in the result will be NA. So now we select only those rows where Zone.A is NA and fill them using your second rule.
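
To make the intermediate step visible, here is a minimal sketch that stops right after the merge and prints the rows that found no match. It rebuilds only the rows actually shown in the question (the elided rows of Loading are left out), and the name `unmatched` is just an illustrative helper, not something from the question.

```r
# Toy data: only the rows shown in the question; elided rows are omitted
Indicators <- data.frame(Row = 1:3, Val.A = c(30, 3, 1), Val.B = c(20, 40, 100))
Loading <- data.frame(
  Index  = c(1, 2, 3, 10),
  Zone.A = c(10, 20, 40, 10),
  Zone.B = c(20, 40, 100, 10),
  Zone.C = c(1, 10, 100, 10),
  Zone.D = c(23, 34.5, 67.8, 10),
  Zone.E = c(34.5, 54, 98.2, 10)
)

# Left join on Val.A == Index; all.x = TRUE keeps unmatched Indicators rows
z <- merge(Indicators, Loading, by.x = "Val.A", by.y = "Index", all.x = TRUE)

# Rows with no match in Loading carry NA in every Zone column
unmatched <- is.na(z$Zone.A)   # illustrative helper name
print(z[unmatched, ])
```

With this data only the Val.A = 30 row should be flagged, which is the row the second rule then has to fill.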

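This assumes that Loading is sorted on Index (so the last row has max(Index)). Note also that merge() sorts its result by the key, so the rows come back ordered by Val.A rather than in the original order of Indicators.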
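
If the sort assumption is a concern, a variant along the following lines might be worth trying. It is only a sketch, not the approach above: it swaps merge() for match(), picks the fallback row with which.max() so that Loading need not be sorted, and keeps the original row order of Indicators. It also uses outer() for the fill because, as far as I can tell, multiplying a one-row data frame by a longer vector recycles across columns, so the one-line fill above looks safe only when a single row is unmatched. The fourth Indicators row (Val.A = 12) is hypothetical and is there purely to exercise that case; `zones`, `idx`, `last` and `miss` are illustrative names.

```r
# Rows shown in the question, plus a HYPOTHETICAL fourth Indicators row
# (Val.A = 12) so that more than one row is unmatched
Indicators <- data.frame(Row = 1:4, Val.A = c(30, 3, 1, 12),
                         Val.B = c(20, 40, 100, 5))
# Loading is deliberately NOT sorted on Index; elided rows are omitted
Loading <- data.frame(
  Index  = c(10, 2, 3, 1),
  Zone.A = c(10, 20, 40, 10),
  Zone.B = c(10, 40, 100, 20),
  Zone.C = c(10, 10, 100, 1),
  Zone.D = c(10, 34.5, 67.8, 23),
  Zone.E = c(10, 54, 98.2, 34.5)
)

zones <- paste0("Zone.", LETTERS[1:5])

# Position of each Val.A in Loading$Index; NA where no such index exists
idx <- match(Indicators$Val.A, Loading$Index)

# Plain lookup for every row (unmatched rows come back as all-NA)
result <- Loading[idx, zones]
rownames(result) <- NULL

# Fallback row: the one with the largest Index, wherever it sits in Loading
last <- unlist(Loading[which.max(Loading$Index), zones])

# outer() builds one scaled copy of the fallback row per unmatched Val.A
miss <- is.na(idx)
if (any(miss)) {
  result[miss, ] <- outer(Indicators$Val.A[miss], last)
}
print(result)
```

Because match() returns one position per row of Indicators, the result lines up row for row with the input, which is the layout the question asks for.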
