Thank you in advance for reading this. I have a function which was working just fine on data.table 1.9.3. But today I updated my data.table package and my function does not work.
Here is my function and working example on data.table 1.9.3:
trait.by <- function(data,traits="",cross.by){
  traits = intersect(traits,names(data))
  if(length(traits)<1){
    #if there is no intersect between names and traits
    return( data[, list(N. = .N), by=cross.by])
  }else{
    return(data[,c( N. = .N,
                    MEAN = lapply(.SD,function(x){return(round(mean(x,na.rm=T),digits=1))}) ,
                    SD = lapply(.SD,function(x){return(round(sd (x,na.rm=T),digits=2))}) ,
                    'NA' = lapply(.SD,function(x){return(sum (is.na(x)))})),
                by=cross.by, .SDcols = traits])
  }
}
> trait.by(data.table(iris),traits = c("Sepal.Length", "Sepal.Width"),cross.by="Species")
# Species N. MEAN.Sepal.Length MEAN.Sepal.Width SD.Sepal.Length
#1: setosa 50 5.0 3.4 0.35
#2: versicolor 50 5.9 2.8 0.52
#3: virginica 50 6.6 3.0 0.64
# SD.Sepal.Width NA.Sepal.Length NA.Sepal.Width
#1: 0.38 0 0
#2: 0.31 0 0
#3: 0.32 0 0
The point is that MEAN.(traits), SD.(traits) and NA.(traits) are computed for all columns that I give in the traits variable.
When I run this with data.table 1.9.4 I receive the following error:
> trait.by(data.table(iris),traits = c("Sepal.Length", "Sepal.Width"),cross.by="Species")
#Error in assign("..FUN", eval(fun, SDenv, SDenv), SDenv) :
# cannot change value of locked binding for '..FUN'
Any idea how I should fix this?!
2 Answers
#1
4
Update: This has been fixed now in 1.9.5 in commit 1680. From NEWS:
- Fixed a bug in the internal optimisation of j-expression with more than one lapply(.SD, function(..) ..) as illustrated here on SO. Closes #985. Thanks to @jadaliha for the report and to @BrodieG for the debugging on SO.
Now this works as expected:
data[,
c(
MEAN = lapply(.SD,function(x){return(round(mean(x,na.rm=T),digits=1))}),
SD = lapply(.SD,function(x){return(round(sd (x,na.rm=T),digits=2))})
), by=cross.by, .SDcols = traits]
This looks like a bug that manifests as a result of multiple uses of lapply(.SD, FUN) in one data.table call in combination with c(. You can work around it by replacing c( with .(.
traits <- c("Sepal.Length", "Sepal.Width")
cross.by <- "Species"
data <- data.table(iris)
data[,
c(
MEAN = lapply(.SD,function(x){return(round(mean(x,na.rm=T),digits=1))})
),
by=cross.by, .SDcols = traits
]
Works.
data[,
c(
SD = lapply(.SD,function(x){return(round(sd (x,na.rm=T),digits=2))})
),
by=cross.by, .SDcols = traits
]
Works.
data[,
c(
MEAN = lapply(.SD,function(x){return(round(mean(x,na.rm=T),digits=1))}),
SD = lapply(.SD,function(x){return(round(sd (x,na.rm=T),digits=2))})
),
by=cross.by, .SDcols = traits
]
Doesn't work
data[,
.(
MEAN = lapply(.SD,function(x){return(round(mean(x,na.rm=T),digits=1))}),
SD = lapply(.SD,function(x){return(round(sd (x,na.rm=T),digits=2))})
),
by=cross.by, .SDcols = traits
]
Works.
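If you need to keep the original wide layout (one MEAN.*, SD.* and NA.* column per trait) rather than the list columns that .( produces, one option is to build the flat named list in a helper and pass .SD to it once. This is only a sketch: summarise_traits and trait_by2 are names I made up for illustration, and I have only reasoned about it against current data.table behaviour, not tested it on 1.9.4.

```r
library(data.table)

# Hypothetical helper (not part of data.table): takes .SD and returns one
# flat, named list with MEAN.<col>, SD.<col> and NA.<col> entries.
summarise_traits <- function(sd_cols) {
  stats <- list(
    MEAN = function(x) round(mean(x, na.rm = TRUE), digits = 1),
    SD   = function(x) round(sd(x, na.rm = TRUE), digits = 2),
    "NA" = function(x) sum(is.na(x))
  )
  out <- list()
  for (stat in names(stats)) {
    # One value per column of .SD for this statistic
    vals <- lapply(sd_cols, stats[[stat]])
    names(vals) <- paste(stat, names(sd_cols), sep = ".")
    out <- c(out, vals)
  }
  out
}

# Hypothetical variant of trait.by that calls the helper once per group,
# so j contains no direct lapply(.SD, ...) calls inside c(
trait_by2 <- function(data, traits = "", cross.by) {
  traits <- intersect(traits, names(data))
  if (length(traits) < 1) {
    return(data[, list(N. = .N), by = cross.by])
  }
  data[, c(list(N. = .N), summarise_traits(.SD)),
       by = cross.by, .SDcols = traits]
}

res <- trait_by2(data.table(iris),
                 traits = c("Sepal.Length", "Sepal.Width"),
                 cross.by = "Species")
print(res)
```

The column names come from the names of the list the helper returns, so they should match the MEAN.Sepal.Length style of the output in the question.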
#2
2
Like this? The output format has changed slightly, but all the results are there.
trait.by <- function(data,traits="",cross.by){
  traits = intersect(traits,names(data))
  if(length(traits)<1){
    #if there is no intersect between names and traits
    return(data[, list(N. = .N), by=cross.by])
  }else{
    # ** Changes: use list instead of c; the return() calls inside the
    # anonymous functions are not needed. Also add a new col_Nam column,
    # with reference to the comments below.
    return(data[, list(N. = .N,
                       MEAN = lapply(.SD,function(x){round(mean(x,na.rm=T),digits=1)}) ,
                       SD = lapply(.SD,function(x){round(sd (x,na.rm=T),digits=2)}) ,
                       'NA' = lapply(.SD,function(x){sum (is.na(x))}),
                       col_Nam = names(.SD)),
                by=cross.by, .SDcols = traits])
  }
}
trait.by(data.table(iris),traits = c("Sepal.Length", "Sepal.Width"),cross.by="Species")
# result
Species N. MEAN SD NA col_Nam
1: setosa 50 5 0.35 0 Sepal.Length
2: setosa 50 3.4 0.38 0 Sepal.Width
3: versicolor 50 5.9 0.52 0 Sepal.Length
4: versicolor 50 2.8 0.31 0 Sepal.Width
5: virginica 50 6.6 0.64 0 Sepal.Length
6: virginica 50 3 0.32 0 Sepal.Width
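Note that MEAN, SD and NA in the result above are list columns. If a long layout like this is acceptable, a possible alternative is to melt the trait columns first and then aggregate, which gives plain numeric columns. This is a sketch rather than a drop-in replacement: the names long, res_long and N.NA are my own choices for illustration.

```r
library(data.table)

traits <- c("Sepal.Length", "Sepal.Width")
dt <- data.table(iris)

# Stack the trait columns: one row per (Species, trait, value)
long <- melt(dt, id.vars = "Species", measure.vars = traits,
             variable.name = "col_Nam", value.name = "value")

# Aggregate once per Species/trait pair; every result column is atomic
res_long <- long[, list(N.   = .N,
                        MEAN = round(mean(value, na.rm = TRUE), digits = 1),
                        SD   = round(sd(value, na.rm = TRUE), digits = 2),
                        N.NA = sum(is.na(value))),
                 by = c("Species", "col_Nam")]

# Sort so that the traits of each species sit next to each other
setorderv(res_long, c("Species", "col_Nam"))
print(res_long)
```

Because there is no lapply(.SD, ...) in j at all, this form does not depend on the j-expression optimisation discussed in the other answer.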