I have a data frame called data.df with columns col1, col2, col3, ..., col15. The data frame has no designated class attribute; any column could potentially be used as the class variable. I would like an R variable called target that points to the column to be treated as the class, as follows:
target<-data.df$col3
and then use that field (target) as input to several learners such as PART and J48 (from the RWeka package):
part<-PART(target~.,data=data.df,control=Weka_control(M=200,R=FALSE))
j48<-J48(target~.,data=data.df,control=Weka_control(M=200,R=FALSE))
The idea is to be able to change 'target' only once at the beginning of my R code. How can this be done?
2 Solutions
#1
11
I sometimes manage to get a lot done by using strings to reference columns. It works like this:
> df <- data.frame(numbers=seq(5))
> df
  numbers
1       1
2       2
3       3
4       4
5       5
> df$numbers
[1] 1 2 3 4 5
> df[['numbers']]
[1] 1 2 3 4 5
You can then have a variable target hold the name of your desired column as a string. I don't know about RWeka, but many libraries such as ggplot2 can take string references for columns (e.g. the aes_string function instead of aes).
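For formula-based learners such as the PART and J48 calls in the question, one possible way to use a string-valued target is to build the formula from it. The sketch below is only an illustration under assumptions: the toy data.df and its values are made up, and the RWeka calls are left commented out and untested here; only base R functions (paste, as.formula) are actually run.

```r
# Toy stand-in for the question's data frame (hypothetical values).
data.df <- data.frame(
  col1 = c(1.2, 3.4, 5.6, 7.8),
  col2 = c(10, 20, 30, 40),
  col3 = factor(c("a", "b", "a", "b"))
)

# Change the class column here, once.
target <- "col3"

# Build the formula `col3 ~ .` from the string.
f <- as.formula(paste(target, "~ ."))
print(f)

# The learners from the question would then take the formula (untested here):
# part <- PART(f, data = data.df, control = Weka_control(M = 200, R = FALSE))
# j48  <- J48(f,  data = data.df, control = Weka_control(M = 200, R = FALSE))

# The class values themselves are still available by name.
class.values <- data.df[[target]]
```

Because the formula names a real column of data.df, the `.` on the right-hand side expands to the remaining columns only. With target<-data.df$col3, col3 would still be a column of data.df and would therefore also appear among the predictors.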
#2
6
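If you are asking about references in the pointer sense, that is not possible here: target<-data.df$col3 copies the column's values rather than pointing at the column.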
However, if you are asking how to get a column whose name is not given explicitly, that is possible with the [ operator, like this:
theNameOfColumnIwantToGetSummaryOf<-"col3"
summary(data.df[,theNameOfColumnIwantToGetSummaryOf])
...or like this, building the column name from an index:
myIndexOfTheColumnIwantToGetSummaryOf<-3
summary(data.df[,sprintf("col%d",myIndexOfTheColumnIwantToGetSummaryOf)])
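If the column names do not follow the col<number> pattern, the name can also be looked up from the position with names(). A minimal sketch, using a made-up data.df and hypothetical variable names (target.index, target.name):

```r
# Toy data frame (hypothetical values and column names).
data.df <- data.frame(height = c(150, 160, 170),
                      weight = c(50, 60, 70),
                      label  = c("x", "y", "x"))

target.index <- 3                              # position of the class column
target.name  <- names(data.df)[target.index]   # look up its name

# [[ ]] accepts either the name or the position and returns the column vector.
by.name  <- data.df[[target.name]]
by.index <- data.df[[target.index]]
summary(by.name)
```

Keeping the name rather than the position is usually the more robust choice, since it still refers to the same column if columns are reordered.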