What does "a standard formula interface to data.frame" mean in R?

Time: 2022-11-18 16:55:18

The documentation for aggregate states:

‘aggregate.formula’ is a standard formula interface to ‘aggregate.data.frame’.

I am new to R, and I don't understand what this means. Please explain!

Thanks!

Uri

1 solution

#1


10  

Jump to the middle of the examples section of help(aggregate) and you will see this:

 ## Formulas, one ~ one, one ~ many, many ~ one, and many ~ many:
 aggregate(weight ~ feed, data = chickwts, mean)
 aggregate(breaks ~ wool + tension, data = warpbreaks, mean)
 aggregate(cbind(Ozone, Temp) ~ Month, data = airquality, mean)
 aggregate(cbind(ncases, ncontrols) ~ alcgp + tobgp, data = esoph, sum)

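These are four different calls to aggregate(), all using the formula interface. The wording in the documentation you quote refers to the method dispatch mechanism used throughout R: aggregate() is a generic function, and aggregate.formula and aggregate.data.frame are two of its methods.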

Consider the first example:

R> class(weight ~ feed)
[1] "formula"
R> class(chickwts)
[1] "data.frame"

So aggregate() dispatches on its first argument (of class formula). Formulas in R are typically resolved through a model.frame (or, when fitting models, a model.matrix); I presume something similar happens here, and an equivalent call is eventually executed by aggregate.data.frame, using the second argument chickwts, a data.frame.

R> aggregate(weight ~ feed, data = chickwts, mean)
       feed  weight
1    casein 323.583
2 horsebean 160.200
3   linseed 218.750
4  meatmeal 276.909
5   soybean 246.429
6 sunflower 328.917
R> 
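
The dispatch described above can be imitated with a toy generic. This is only a sketch: describe() and its methods are hypothetical names made up for illustration and are not part of R or of aggregate(). It shows how UseMethod() picks a method from the class of the first argument, which is the same mechanism that routes a formula to aggregate.formula.

```r
# Hypothetical generic (not part of R): UseMethod() selects a method
# based on the class of the first argument.
describe <- function(x, ...) UseMethod("describe")

# One method per class, named generic.class
describe.formula    <- function(x, ...) "formula method"
describe.data.frame <- function(x, ...) "data.frame method"
describe.default    <- function(x, ...) "default method"

res_formula <- describe(weight ~ feed)  # first argument has class "formula"
res_df      <- describe(chickwts)       # first argument has class "data.frame"
res_other   <- describe(1:3)            # no specific method, falls back to default

print(c(res_formula, res_df, res_other))
```

Calling aggregate(weight ~ feed, ...) works the same way: the generic sees a formula as its first argument and hands the call to the formula method.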

What you asked isn't the easiest beginner question. I'd recommend a thorough look at the documentation and a decent R book if you have one handy. (Other SO questions give recommendations on what to read next.)

Edit: I had to dig a little, as aggregate.formula() is not exported from the stats namespace, but you can look at it by typing stats:::aggregate.formula at the prompt. The source then clearly shows that it does, in fact, hand the work off to aggregate.data.frame():

 [.... some code omitted ...]
    if (is.matrix(mf[[1L]])) {
        lhs <- as.data.frame(mf[[1L]])
        names(lhs) <- as.character(m[[2L]][[2L]])[-1L]
        aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...)
    }
    else aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...)
}
<environment: namespace:stats>
R> 
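
Given that hand-off, the formula call should give the same group means as a direct call through the data.frame interface. The sketch below compares the two on chickwts; the variable names (by_formula, by_df, f) are just illustrative. It also uses getS3method() from utils as an alternative way to fetch the method for inspection, assuming, as described above, that the method is not visible by name at the prompt.

```r
# Formula interface: the generic dispatches to the formula method
by_formula <- aggregate(weight ~ feed, data = chickwts, FUN = mean)

# Data-frame interface: the same computation written out by hand,
# with the values to summarise first and the grouping columns in `by`
by_df <- aggregate(chickwts["weight"], by = chickwts["feed"], FUN = mean)

# Compare the aggregated column from both routes
print(all.equal(by_formula$weight, by_df$weight))

# Fetch the formula method for inspection without using `:::`
f <- getS3method("aggregate", "formula")
print(is.function(f))
```

The formula version is mostly a convenience: it builds the two data frames (values and grouping variables) from the formula and the data argument, so you do not have to subset them yourself.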
