I want to end a pipe with an assignment operator in R.
My goal (in pseudo-R):
data %>% analysis functions %>% analyzedData
where data and analyzedData are both data.frames.
I've tried a few variants of this, each giving a different error message. Some of the variants I've tried:
data %>% analysis functions %>% -> analyzedData
data %>% analysis functions %>% .-> analyzedData
data %>% analysis functions %>% <-. analyzedData
data %>% analysis functions %>% <- analyzedData
Error messages:
Error in function_list[[k]](value) :
could not find function "analyzedData"
Error: object 'analyzedData' not found
Error: unexpected assignment in: ..
Update: the way I figured out to do this is:
data %>% do analysis %>% {.} -> analyzedData
This way, to troubleshoot or debug a long pipe, you can drop these two lines into your pipe to minimize code reruns and isolate the problem.
data %>% pipeline functions %>%
{.} -> tmpWayPoint
tmpWayPoint %>%
more pipeline functions %>% {.} -> endPipe
4 answers
#1
8
It's probably easiest to do the assignment as the first thing (as scoa mentions), but if you really want to put it at the end you could use assign:
mtcars %>%
group_by(cyl) %>%
summarize(m = mean(hp)) %>%
assign("bar", .)
which will store the output in a variable named "bar".
Alternatively, you could just use the -> operator. You mention it in your question, but it looks like you used something like
mtcars %>% -> yourvariable
instead of
mtcars -> yourvariable
You don't want to have %>% in front of the ->.
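Both options can be sketched as a runnable example. This is only a sketch: it assumes the magrittr package is installed and that the code runs at the top level, so .GlobalEnv is the intended destination. The explicit envir argument is my addition; as far as I know, older magrittr versions evaluate pipe steps in a temporary environment, so assign() without envir may not create the variable where you expect. The names bar and baz are just placeholders.

```r
library(magrittr)

# Option 1: assign() as the last pipe step.
# The target environment is stated explicitly (assumption: top-level script).
mtcars %>%
  subset(cyl == 4) %>%
  assign("bar", ., envir = .GlobalEnv)

# Option 2: right assignment after the last step, with no %>% before the ->.
mtcars %>%
  subset(cyl == 4) -> baz

# Both variables should now hold the same data.frame.
identical(bar, baz)
```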
#2
6
It looks like you're trying to decorate the %>% pipeline operator with the side effect of creating a new object. One would assume that you could use the assignment operator -> for this, but it won't work in the middle of a pipeline. This is because -> has lower precedence than user-defined operators like %>%, which messes up the parsing: your pipeline will be parsed as (initial_stages) -> (final_stages), which is nonsensical.
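The parsing can be inspected without running any pipeline. The sketch below uses quote() to capture an expression with -> in the middle; a, b, f and g are placeholder names that are never evaluated, so no packages are needed.

```r
# Capture the expression without evaluating it.
e <- quote(a %>% f -> b %>% g)

e[[1]]  # the top-level call is the assignment (the parser stores -> as <-)
e[[2]]  # assignment target: everything after the ->
e[[3]]  # assigned value: everything before the ->
```

The assignment target ends up being the whole second half of the pipe rather than a variable name, which is why R rejects it when the expression is evaluated.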
A solution is to replace -> with a user-defined version. While we're at it, we might as well use the lazyeval package to ensure it creates the object in the environment where it's supposed to go:
`%->%` <- function(value, x)
{
    x <- lazyeval::lazy(x)
    assign(deparse(x$expr), value, x$env)
    value
}
An example of this in use:
library(dplyr)  # provides %>%, group_by() and summarise()

smry <- mtcars %>%
    group_by(cyl) %->%   # ->, not >
    tmp %>%
    summarise(m=mean(mpg))
tmp
#Source: local data frame [32 x 11]
#Groups: cyl
#
# mpg cyl disp hp drat wt qsec vs am gear carb
#1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
#5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#.. ... ... ... ... ... ... ... .. .. ... ...
smry
#Source: local data frame [3 x 2]
#
# cyl m
#1 4 26.66364
#2 6 19.74286
#3 8 15.10000
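If you would rather not depend on lazyeval, the same idea can be sketched with base R's substitute() and parent.frame(). This is a variant of the operator above, not the original code; it assumes the operator is called directly from the environment where the variable should be created, and it only needs magrittr for %>%.

```r
library(magrittr)

# Variant of %->% using base R only: stores `value` under the bare name
# given on the right-hand side, then passes `value` along unchanged.
`%->%` <- function(value, name) {
  target <- deparse(substitute(name))  # the bare name as a string
  env <- parent.frame()                # environment of the caller
  assign(target, value, envir = env)
  value
}

smry <- mtcars %>%
  subset(cyl == 4) %->%   # saves the intermediate result as tmp
  tmp %>%
  nrow()
```

Here tmp holds the filtered data.frame and smry holds the result of the full pipe.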
#3
4
You can think of a pipe chain as a multi-line expression that works like any other multi-line expression. The usual way to save the output is to assign it on the first line:
analyzedData <- data %>% analysis functions
Just as you would with:
plot <- ggplot(data,aes(x=x,y=x)) +
geom_point()
#4
2
Update: the way I figured out to do this is: data %>% do analysis %>% {.} -> analyzedData
This way, to troubleshoot or debug a long pipe, you can drop these two lines into your pipe to minimize code reruns and isolate the problem.
data %>% pipeline functions %>%
{.} -> tmpWayPoint
tmpWayPoint %>%
more pipeline functions %>% {.} -> endPipe
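A runnable sketch of the waypoint pattern, assuming the magrittr package and using mtcars with subset() and nrow() as stand-ins for the real pipeline functions:

```r
library(magrittr)

# First half of the pipe; the intermediate result is saved as a waypoint.
mtcars %>%
  subset(cyl == 4) %>%
  {.} -> tmpWayPoint

# Resume from the waypoint without rerunning the first half.
tmpWayPoint %>%
  nrow() %>%
  {.} -> endPipe

# Because -> binds more loosely than %>%, the {.} step can also be left out.
mtcars %>%
  subset(cyl == 4) -> sameWayPoint
```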
If you have a better way of doing this please let me know.