
Time: 2021-01-12 14:58:21

I have a large set of model specifications to test, which share a dependent variable (dv) but have different independent variables (IVs). In the following example:


foo <- data.frame(dv  = sample(c(0,1), 100, replace=T),
                  x1 = runif(100),
                  x2 = runif(100))

I want the first model to only include x1, the second x2, the third both, and the fourth their interaction. So I thought a sensible way would be to build a list of formula statements:


# entries after "x1" are reconstructed from the description above
bar <- list("x1", "x2", "x1 + x2", "x1 * x2")

which I would then use in a llply call from the plyr package to obtain a list of model objects.


res <- llply(bar, function(i) glm(dv ~ i, data = foo, family = binomial()))

Unfortunately I'm told


Error in model.frame.default(formula = dv ~ i, data = foo, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'i')

Obviously I'm mixing up something fundamental--do I need to manipulate the original foo list in some fashion?


2 Answers



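The problem is that dv ~ i isn't the formula you intended. Inside the anonymous function, i is simply a symbol for a variable holding a single character value, so glm looks for a variable literally named i, finds that length-one string instead of a column of foo, and reports that the variable lengths differ.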
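A minimal sketch of what R sees in each case, using all.vars() from base R to list the variables a formula refers to. The names f and g are only for illustration.

```r
i <- "x1"

# Writing the symbol i in a formula does not substitute its value:
f <- dv ~ i
all.vars(f)   # "dv" "i"

# Building the formula from text does pick up the value of i:
g <- as.formula(paste("dv ~", i))
all.vars(g)   # "dv" "x1"
```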


Try this:

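library(plyr)

# entries after "dv~x1" are reconstructed from the question's description
bar <- list("dv~x1", "dv~x2", "dv~x1+x2", "dv~x1*x2")

res <- llply(bar, function(i) glm(i, data = foo, family = binomial()))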

But setting statistical issues aside, it might be easier to use something like step (in the stats package, see ?step) or stepAIC in the MASS package for tasks like this.
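For what it's worth, a rough sketch of the step() route, assuming the same kind of simulated data as in the question. The names full and selected are made up for this example, and step() here does backward selection by AIC starting from the full interaction model, which is not the same thing as fitting all four models side by side.

```r
set.seed(1)
foo <- data.frame(dv = sample(c(0, 1), 100, replace = TRUE),
                  x1 = runif(100),
                  x2 = runif(100))

# Start from the largest model of interest (main effects plus interaction)
full <- glm(dv ~ x1 * x2, data = foo, family = binomial())

# Backward selection by AIC; trace = 0 suppresses the step-by-step printout
selected <- step(full, trace = 0)

# Formula of the model that step() settled on
formula(selected)
```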




Your problem is with how you are specifying the formula: inside the function, i is a variable holding a string, not a term of the formula. Pasting the string into the formula text would work:


glm(paste("dv ~", i), data = foo, family = binomial())
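Putting that together with the question's data, a self-contained sketch might look like the following. It uses base R's lapply instead of plyr::llply so nothing extra needs to be installed, wraps the pasted string in as.formula() to make the conversion explicit, and assumes the four right-hand sides described in the question. The names rhs and fit_one are hypothetical.

```r
set.seed(42)
foo <- data.frame(dv = sample(c(0, 1), 100, replace = TRUE),
                  x1 = runif(100),
                  x2 = runif(100))

# Right-hand sides of the four models (assumed from the question's description)
rhs <- c("x1", "x2", "x1 + x2", "x1 * x2")

# Build the formula from text, then fit a logistic regression
fit_one <- function(i) {
  f <- as.formula(paste("dv ~", i))
  glm(f, data = foo, family = binomial())
}

res <- lapply(rhs, fit_one)
names(res) <- rhs

# Number of estimated coefficients per model, intercept included
sapply(res, function(m) length(coef(m)))
```

Naming the list by its right-hand side makes it easy to pull out a single fit later, e.g. res[["x1 + x2"]].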


