Here is an example of the problem I am trying to solve.
I have a data frame (built here as a matrix with cbind):
x <- matrix(1:9, nrow = 3, ncol = 3)
y <- c(1, 100, 2)
z <- c("A", "B", "C", "D")
df <- cbind(x, y)
colnames(df) <- z
     A B C   D
[1,] 1 4 7   1
[2,] 2 5 8 100
[3,] 3 6 9   2
For each row, I want to calculate the distance between D and each of the values in A, B, and C, and then find the two of A, B, and C that are closest to D.
This was my latest attempt:
test <- function(x, y, z) {
  d <- abs(x - y)
  df <- data.frame(z, d)
  df <- df[order(d), ]
  d <- c(df[1:2, 1])
  d <- paste(d[1], "-", d[2], sep = "")
}
results<adply(test, 1, transform, res = test(
c("A","B","C"),D,1:3]))
This is the error I am getting:
Error in splitter_a(.data, .margins, .expand, .id) : Invalid margin
I want the result to be a data frame like this:
     A B C   D res
[1,] 1 4 7   1 A-B
[2,] 2 5 8 100 C-B
[3,] 3 6 9   2 A-B
Any help provided is greatly appreciated.
NT
Edit: The suggested answer worked in my test case, but it does not work when translated to my real scenario. Here is a sample of the DF:
      0%      25%     50%      75%   100% target
1 350.00 350.0000 380.610 380.6100 416.25  425.0
2 350.00 350.0000 350.000 350.0000 350.00  425.0
3 223.83 383.6800 414.890 472.3050 529.20  425.0
4 442.36 442.9625 443.565 444.1675 444.77  472.8
5 466.00 525.4800 529.200 529.2000 529.20  465.6
6 350.00 357.1650 364.330 371.4950 378.66  513.6
This is how the script translates to my scenario:
apply(DF, 1, function(x){
paste(c("0%","25%","50%","75%","100%")[order(abs(x[c(columns[1:5])] - x["target"]))][1:2], collapse = "-")
})
I am getting the following error:
Error in x[c(columns[5:9])] - x[target] :
non-numeric argument to binary operator
I have confirmed the data values are numeric.
1 Answer
apply(df, 1, function(x) {
  paste(c("A", "B", "C")[order(abs(x[c("A", "B", "C")] - x["D"]))][1:2], collapse = "-")
})
#[1] "A-B" "C-B" "A-B"
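To see what the one-liner does, it can help to unpack it for a single row. The sketch below is only an illustration: it rebuilds the example matrix from the question, walks through row 2, and then attaches the result as a res column, which is what the question asked for. The names r2, d, res, and out are made up for this example.

```r
# Rebuild the example data from the question (a numeric matrix).
x <- matrix(1:9, nrow = 3, ncol = 3)
df <- cbind(x, c(1, 100, 2))
colnames(df) <- c("A", "B", "C", "D")

# Step by step for row 2 (A = 2, B = 5, C = 8, D = 100).
r2 <- df[2, ]
d  <- abs(r2[c("A", "B", "C")] - r2["D"])  # distances: A = 98, B = 95, C = 92
names(d)[order(d)][1:2]                    # the two closest columns: "C" "B"

# The same logic applied to every row, then attached as a column.
res <- apply(df, 1, function(r) {
  dd <- abs(r[c("A", "B", "C")] - r["D"])
  paste(names(dd)[order(dd)][1:2], collapse = "-")
})
out <- data.frame(df, res = res)
print(out)
```

Because df is a matrix here, data.frame() is used to add the character column; cbind() on the matrix would turn every value into a string.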
UPDATE
#DATA
df = read.table(strip.white = TRUE,
stringsAsFactors = FALSE,
header = TRUE,
check.names = FALSE,
text = "0% 25% 50% 75% 100% target
1 350.00 350.0000 380.610 380.6100 416.25 425.0
2 350.00 350.0000 350.000 350.0000 350.00 425.0
3 223.83 383.6800 414.890 472.3050 529.20 425.0
4 442.36 442.9625 443.565 444.1675 444.77 472.8
5 466.00 525.4800 529.200 529.2000 529.20 465.6
6 350.00 357.1650 364.330 371.4950 378.66 513.6")
apply(df, 1, function(x) {
  paste(names(x)[order(abs(x[1:5] - x[6]))][1:2], collapse = "-")
})
#          1          2          3          4          5          6
# "100%-50%"   "0%-25%"  "50%-25%" "100%-75%"   "0%-25%" "100%-75%"
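As for the "non-numeric argument to binary operator" error: one common cause, and this is only a guess since the full DF is not shown, is that apply() converts a data frame with as.matrix() first, so a single character or factor column turns every value in the row into a string. The sketch below reproduces that with a made-up id column and then restricts apply() to the numeric columns. DF2, id, num_cols, and res2 are hypothetical names, not anything from the question.

```r
# Hypothetical data frame: three quantile columns, a target, and a character id.
DF2 <- data.frame(
  id     = c("a", "b"),
  `0%`   = c(350, 223.83),
  `25%`  = c(350, 383.68),
  `50%`  = c(380.61, 414.89),
  target = c(425, 425),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# The quantile and target columns are numeric on their own...
sapply(DF2, class)
# ...but the matrix that apply() works on is character, so x[...] - x["target"] fails.
is.character(as.matrix(DF2))

# Keep only the numeric columns before calling apply().
num_cols <- names(DF2)[sapply(DF2, is.numeric)]
res2 <- apply(DF2[num_cols], 1, function(x) {
  vals <- x[setdiff(num_cols, "target")]
  paste(names(vals)[order(abs(vals - x["target"]))][1:2], collapse = "-")
})
res2
```

If sapply(DF, class) on the real data shows anything other than numeric or integer, subsetting to the numeric columns (or dropping the offending column) before apply() should avoid the error.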