What are the benefits of using with()? The help file says it evaluates the expression in an environment constructed from the data. What are the benefits of this? Is it faster to create an environment and evaluate the expression there, as opposed to just evaluating it in the global environment? Or is there something else I'm missing?
2 Answers
#1
8
with() is a wrapper for functions with no data argument
There are many functions that work on data frames and take a data argument (or take the data frame as their first argument) so that you don't need to retype the name of the data frame every time you reference a column. lm, plot.formula, subset, and transform are just a few examples.
with() is a general-purpose wrapper that lets you use any function as if it had a data argument.
Using the mtcars data set, we could fit a model with or without using the data argument:
# this is obviously annoying
mod = lm(mtcars$mpg ~ mtcars$cyl + mtcars$disp + mtcars$wt)
# this is nicer
mod = lm(mpg ~ cyl + disp + wt, data = mtcars)
However, if (for some strange reason) we wanted to find the mean of cyl + disp + wt, there is a problem because mean() doesn't have a data argument like lm() does. This is the issue that with() addresses:
# without with(), we would be stuck here:
z = mean(mtcars$cyl + mtcars$disp + mtcars$wt)
# using with(), we can clean this up:
z = with(mtcars, mean(cyl + disp + wt))
Wrapping foo() in with(data, foo(...)) lets us use any function foo as if it had a data argument, which is to say we can use unquoted column names and avoid repetitive data_name$column_name or data_name[, "column_name"].
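On the question of what the environment actually buys you: as far as I know, the default with() method amounts to eval(substitute(expr), data, enclos = parent.frame()), so names are looked up in the data first and in the calling environment second. A minimal sketch of that lookup order (the variable offset is made up for illustration):

```r
# Names are resolved in the data first, then in the calling environment.
offset <- 100  # hypothetical variable that lives outside mtcars

z_dollar <- mean(mtcars$cyl + mtcars$disp + mtcars$wt)
z_with   <- with(mtcars, mean(cyl + disp + wt))
z_eval   <- eval(quote(mean(cyl + disp + wt)), mtcars)

# All three approaches compute the same value
all.equal(z_dollar, z_with)
all.equal(z_with, z_eval)

# mpg is found in mtcars, offset is found in the calling environment
shifted <- with(mtcars, mean(mpg) + offset)
```

So the benefit is convenience of name lookup rather than anything special about where the result ends up.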
When to use with
Use with() interactively and in your R scripts to save typing and make your code clearer. The more often you would need to retype your data frame name in a single command (and the longer your data frame name is!), the greater the benefit of using with().
Also note that with() isn't limited to data frames. From ?with:
For the default with method this may be an environment, a list, a data frame, or an integer as in sys.call.
I don't often work with environments, but when I do I find with() very handy.
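A small sketch of with() on a list and on an environment. The names params, rate, years, and principal are invented for this example:

```r
# with() on a list: the list elements become visible by name
params <- list(rate = 0.05, years = 10, principal = 1000)
final_list <- with(params, principal * (1 + rate)^years)

# with() on an environment works the same way
env <- list2env(params)
final_env <- with(env, principal * (1 + rate)^years)

all.equal(final_list, final_env)
```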
When you need pieces of a result for one line only
As @Rich Scriven suggests in the comments, with() can be very useful when you need to use the results of something like rle(). If you only need the results once, then his example with(rle(data), lengths[values > 1]) lets you use the rle(data) results anonymously.
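To make that concrete, here is the same pattern on a made-up vector (run_input is just an illustrative name), next to the two-step version it replaces:

```r
run_input <- c(1, 1, 2, 2, 2, 3, 1, 1)

# One line: lengths of the runs whose value is greater than 1
run_lengths <- with(rle(run_input), lengths[values > 1])

# The equivalent without with() needs a temporary object
r <- rle(run_input)
run_lengths_two_step <- r$lengths[r$values > 1]

identical(run_lengths, run_lengths_two_step)
```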
When to avoid with
When there is a data argument
Many functions that have a data argument use it for more than just easier syntax when you call them. Most modeling functions (like lm()), and many others too (ggplot()!), do a lot with the provided data. If you use with() instead of a data argument, you'll limit the features available to you. If there is a data argument, use the data argument, not with().
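One small, checkable difference (a sketch, not a full list of what you lose): the fitted coefficients are the same either way, but only the model fitted with data = keeps a record of which data set it came from in its stored call.

```r
fit_data <- lm(mpg ~ cyl + disp + wt, data = mtcars)
fit_with <- with(mtcars, lm(mpg ~ cyl + disp + wt))

# Same coefficients either way
all.equal(coef(fit_data), coef(fit_with))

# Only the first fit records the data set in its call
fit_data$call$data  # the symbol mtcars
fit_with$call$data  # NULL
```

Anything that later inspects or re-evaluates the stored call has less to work with in the second case.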
Adding to the environment
In my example above, the result was assigned in the global environment (z = with(...)). To make an assignment inside the list/environment/data, you can use within(). (In the case of data frames, transform() is also good.)
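A short sketch of both options; the column name mpg_per_wt is invented for the example. Note that both return a modified copy, so the original mtcars is left untouched:

```r
# within() evaluates the assignment inside the data and returns a modified copy
cars_within <- within(mtcars, mpg_per_wt <- mpg / wt)

# transform() does the same for data frames using name = value syntax
cars_transform <- transform(mtcars, mpg_per_wt = mpg / wt)

all.equal(cars_within$mpg_per_wt, cars_transform$mpg_per_wt)

# The original data frame does not gain the new column
"mpg_per_wt" %in% names(mtcars)
```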
In packages
Don't use with() in R packages. There is a warning in help(subset) that could apply just about as well to with():
Warning: This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences.
If you build an R package using with(), when you check it you will probably get warnings or notes about using variables without a visible binding. This can make the package unacceptable to CRAN.
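Inside a package function, the same computation can be written with standard subsetting so that no bare column names appear in the code. A minimal sketch; mean_of_sum is a hypothetical helper, not something from an existing package:

```r
# Hypothetical package-style function written without with()
mean_of_sum <- function(data, cols) {
  # Columns are selected by name with [, so nothing relies on
  # non-standard evaluation
  mean(rowSums(data[, cols, drop = FALSE]))
}

pkg_style   <- mean_of_sum(mtcars, c("cyl", "disp", "wt"))
interactive_style <- with(mtcars, mean(cyl + disp + wt))

all.equal(pkg_style, interactive_style)
```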
Alternatives to with
Don't use attach
Many (mostly dated) R tutorials use attach() to avoid retyping data frame names by making columns accessible from the global environment. attach() is widely considered to be bad practice and should be avoided. One of the main dangers of attach() is that data columns can become out of sync if they are modified individually. with() avoids this pitfall because it is invoked one expression at a time. There are many, many questions on Stack Overflow where new users are following an old tutorial and run into problems because of attach(). The easy solution is always the same: don't use attach().
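To illustrate the out-of-sync problem (shown only as a demonstration of the pitfall; the data frame scores is made up): attach() puts a copy of the columns on the search path, so later changes to the data frame are not reflected in the attached names, whereas with() always looks at the current data.

```r
scores <- data.frame(score = 1:3)

attach(scores)                        # copies the columns onto the search path
scores$score <- scores$score * 10     # modify the data frame afterwards
attached_score <- score               # still the old attached copy
detach(scores)

current_score <- with(scores, score)  # sees the updated column

attached_score
current_score
```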
Using with() all the time seems too repetitive
If you are doing many steps of data manipulation, you may find yourself beginning every line of code with with(my_data, .... You might think this repetition is almost as bad as not using with(). Both the data.table and dplyr packages offer efficient data manipulation with non-repetitive syntax. I'd encourage you to learn to use one of them. Both have excellent documentation.
#2
5
I use it when I don't want to keep typing dataframe$. For example
with(mtcars, plot(wt, qsec))
rather than
plot(mtcars$wt, mtcars$qsec)
The former looks up wt and qsec in the mtcars data.frame. Of course
plot(qsec~wt, mtcars)
is more appropriate for plot() or other functions that take a data= argument.