I am trying to apply IDW (inverse distance weighting) to different groups in a dataset. I am using dplyr to apply the function to each group, but I am making a mistake in the split-apply-combine step. The function returns 10 values for each group of 10 observations, but dplyr currently tries to insert all 10 return values into each mutated cell, rather than one new value per cell.
The problem is likely function-agnostic, but unfortunately I could not find a simpler function that shows the same error.
I get an error message saying the data frame is corrupt, and each cell of the new column holds a 10-element vector instead of a single value:
  group N      Lat     Long Obs   idw_val
1     A 1 49.43952 20.42646  11 <dbl[10]>
2     B 1 49.76982 19.70493   8 <dbl[10]>
The example below hopefully clarifies this. The solution is probably very simple, so any pointers would be much appreciated.
require(ggmap)
require(dplyr)
require(raster)
require(sp)
require(gstat)
require(lattice)
#### create dataset
set.seed(123)
dh = expand.grid(group = c("A", "B", "C"),
                 N = 1:10)
dh$Lat = rnorm(nrow(dh), 50, 1)
dh$Long = rnorm(nrow(dh), 20, 1)
dh$Obs = rpois(nrow(dh), 10)
dh
#### create grid
pixels <- 10
#### function definition
idw_w = function(x, y, z) {
  geog2 <- data.frame(x, y, z)
  coordinates(geog2) = ~x+y
  geog.grd <- expand.grid(x = seq(floor(min(coordinates(geog2)[, 1])),
                                  ceiling(max(coordinates(geog2)[, 1])),
                                  length.out = pixels),
                          y = seq(floor(min(coordinates(geog2)[, 2])),
                                  ceiling(max(coordinates(geog2)[, 2])),
                                  length.out = pixels))
  # Assigning coordinates results in spdataframe.
  grd.pts <- SpatialPixels(SpatialPoints(geog.grd))
  grd <- as(grd.pts, "SpatialGrid")
  #### IDW interpolation
  geog2.idw <- idw(z ~ 1, geog2, grd, idp = 4)
  #### overlay
  pts <- SpatialPoints(cbind(x, y))
  over(pts, geog2.idw["var1.pred"])
}
#### test function
idw_w(dh$Lat, dh$Long, dh$Obs)
#### groupwise dplyr
dh2 = dh %>%
  # arrange(Block, Species, Date) %>%
  group_by(group) %>%
  mutate(idw_val = idw_w(x = Lat, y = Long, z = Obs))
dh2
str(dh2)
1 Solution
#1
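If I understand correctly what you want, it's just a matter of making sure your function returns a vector of values rather than a data.frame object. The only change needed is in the last line of the function, which now extracts the first column of the over() result. I think this function will do what you want when run through the mutate() step: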
idw_w = function(x, y, z) {
  geog2 <- data.frame(x, y, z)
  coordinates(geog2) = ~x+y
  geog.grd <- expand.grid(x = seq(floor(min(coordinates(geog2)[, 1])),
                                  ceiling(max(coordinates(geog2)[, 1])),
                                  length.out = pixels),
                          y = seq(floor(min(coordinates(geog2)[, 2])),
                                  ceiling(max(coordinates(geog2)[, 2])),
                                  length.out = pixels))
  # Assigning coordinates results in spdataframe.
  grd.pts <- SpatialPixels(SpatialPoints(geog.grd))
  grd <- as(grd.pts, "SpatialGrid")
  #### IDW interpolation
  geog2.idw <- idw(z ~ 1, geog2, grd, idp = 4)
  #### overlay
  pts <- SpatialPoints(cbind(x, y))
  # take the first column so a plain vector is returned, not a data.frame
  (over(pts, geog2.idw["var1.pred"]))[, 1]
}
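Since the question mentions not finding a simpler function that shows the same behaviour, here is a minimal sketch of the same idea without any spatial packages. The names returns_df, returns_vec and toy are hypothetical stand-ins invented for this illustration; they are not part of the original code. Exactly what mutate() does with a data.frame return value may depend on the dplyr version, so only the vector case is run through mutate() here:

```r
library(dplyr)

# Hypothetical stand-ins for idw_w(): same values, different return type.
returns_df  <- function(x) data.frame(pred = x * 2)        # a data.frame, like the over() result
returns_vec <- function(x) data.frame(pred = x * 2)[, 1]   # first column as a plain vector

# Toy data (an assumption for illustration): two groups of three rows.
toy <- data.frame(group = rep(c("A", "B"), each = 3), val = 1:6)

# Inspect what each function hands back to mutate().
is.data.frame(returns_df(toy$val))
is.numeric(returns_vec(toy$val))

# The vector version yields one value per row within each group.
res <- toy %>%
  group_by(group) %>%
  mutate(doubled = returns_vec(val)) %>%
  ungroup()

res
```

The point is only that mutate() expects a vector with one element per row of the group, which is what the [, 1] at the end of idw_w() provides.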