Passing glm predictors from a list

Date: 2021-01-12 14:58:21

I have a large set of model specifications to test, which share a dependent variable (dv) but differ in their independent variables (IVs). In the following example

foo <- data.frame(dv = sample(c(0, 1), 100, replace = TRUE),
                  x1 = runif(100),
                  x2 = runif(100))

I want the first model to only include x1, the second x2, the third both, and the fourth their interaction. So I thought a sensible way would be to build a list of formula statements:

bar <- list("x1",
            "x2", 
            "x1+x2",
            "x1*x2")

which I would then use in an llply call from the plyr package to obtain a list of model objects.

require(plyr)
res <- llply(bar, function(i) glm(dv ~ i, data = foo, family = binomial()))

Unfortunately, I get this error:

Error in model.frame.default(formula = dv ~ i, data = foo, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'i')

Obviously I'm mixing up something fundamental--do I need to manipulate the original foo list in some fashion?

2 Answers

#1


2  

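The problem is that dv ~ i is taken literally: the formula refers to a variable named i, not to the string stored in i. Inside the anonymous function, i is simply a symbol bound to a character value of length 1, which is why model.frame complains that the variable lengths differ.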
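
To make the distinction concrete, here is a small sketch (not part of the original answer) that uses all.vars to list the variables each formula refers to. The names literal and built are illustrative only.

```r
i <- "x1"

# Written literally, the formula refers to a variable called `i`.
literal <- dv ~ i
all.vars(literal)   # "dv" "i"

# Pasting the string in first yields a formula that refers to x1.
built <- as.formula(paste("dv ~", i))
all.vars(built)     # "dv" "x1"
```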

Try this:

bar <- list("dv~x1",
            "dv~x2", 
            "dv~x1+x2",
            "dv~x1*x2")

res <- llply(bar, function(i) glm(i, data = foo, family = binomial()))
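
For reference, the same loop can be written without plyr. The following is a minimal sketch, assuming base R's lapply is an acceptable stand-in for llply; the names specs and fits, the seed, and the AIC comparison at the end are illustrative additions rather than part of the answer above.

```r
set.seed(42)  # only so the sketch is reproducible
foo <- data.frame(dv = sample(c(0, 1), 100, replace = TRUE),
                  x1 = runif(100),
                  x2 = runif(100))

specs <- c("dv ~ x1", "dv ~ x2", "dv ~ x1 + x2", "dv ~ x1 * x2")

# Convert each string to a formula explicitly, then fit one model per spec.
fits <- lapply(specs, function(s) {
  glm(as.formula(s), data = foo, family = binomial())
})
names(fits) <- specs

# One AIC value per specification, named by its formula string.
sapply(fits, AIC)
```

Naming the list by formula string keeps it obvious which result belongs to which specification.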

Setting statistical issues aside, it might be easier to use something like step (see ?step) or stepAIC from the MASS package (see ?stepAIC) for tasks like this.
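
As a rough illustration of the step route, the sketch below runs a forward search from an intercept-only model up to the full interaction. The starting model, the scope, and the names null_fit and sel are assumptions made for this example; the answer does not prescribe them, and which terms get selected depends on the simulated data.

```r
set.seed(1)  # only so the sketch is reproducible
foo <- data.frame(dv = sample(c(0, 1), 100, replace = TRUE),
                  x1 = runif(100),
                  x2 = runif(100))

# Start from the intercept-only model.
null_fit <- glm(dv ~ 1, data = foo, family = binomial())

# Let step() add terms by AIC, up to the x1 * x2 model at most.
sel <- step(null_fit,
            scope = list(lower = ~ 1, upper = ~ x1 * x2),
            direction = "forward",
            trace = 0)  # suppress the step-by-step printout

formula(sel)
```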

#2


3  

The problem is how you specify the formula: inside the function, i is a variable holding a string, so dv ~ i does not substitute its value. Building the formula text with paste works:

glm(paste("dv ~", i), data = foo, family = binomial())
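
Put back into the original loop, that one-liner might look like the sketch below. It assumes the bar list from the question (right-hand sides only) and uses base lapply instead of llply. Wrapping the string in as.formula is an optional extra step added here for readability, not something the answer requires.

```r
set.seed(1)  # only so the sketch is reproducible
foo <- data.frame(dv = sample(c(0, 1), 100, replace = TRUE),
                  x1 = runif(100),
                  x2 = runif(100))

bar <- list("x1", "x2", "x1+x2", "x1*x2")

res <- lapply(bar, function(i) {
  f <- as.formula(paste("dv ~", i))   # e.g. dv ~ x1 * x2
  glm(f, data = foo, family = binomial())
})

# Check which terms each model ended up with.
lapply(res, function(m) names(coef(m)))
```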
