I have the following table:
mymatrix <- matrix(c(34,11,65,32,12,9,32,90,21,51,45,23), ncol=3)
colnames(mymatrix) <- c("pos", "neg", "neutr") # class
rownames(mymatrix) <- c("1 -1 0", "-1 -1 0", "0 -1 1", "0 0 1") # patterns
mytable <- as.table(mymatrix)
mytable
#          pos neg neutr
# 1 -1 0    34  12    21
# -1 -1 0   11   9    51
# 0 -1 1    65  32    45
# 0 0 1     32  90    23
Now I have new data with three columns. Each row contains one of the patterns "1 -1 0", "-1 -1 0", "0 -1 1" and "0 0 1". For example, my new data looks like this:
one <- c( 1, 1, 0, -1, 0, 1, 1)
two <- c( -1, -1, -1, -1, 0, -1, -1)
three <- c(0, 0, 1, 0, 1, 0, 0)
mydf <- data.frame(one, two, three)
mydf
#   one two three
# 1   1  -1     0
# 2   1  -1     0
# 3   0  -1     1
# 4  -1  -1     0
# 5   0   0     1
# 6   1  -1     0
# 7   1  -1     0
Now I want to add a fourth column to mydf that assigns a class (pos, neg, neutr) to each row. The class with the highest frequency for that row's pattern in mytable should be assigned.
It should look like this:
#   one two three  four
# 1   1  -1     0   pos  # (because for this pattern (1 -1 0), "pos" has the highest frequency in mytable)
# 2   1  -1     0   pos
# 3   0  -1     1   pos
# 4  -1  -1     0 neutr
# 5   0   0     1   neg
# 6   1  -1     0   pos
# 7   1  -1     0   pos
How can I do that?
Thank you!
1 solution
#1
First, build a mapping from each pattern (the row names of mytable) to its most frequent class. Then look up that mapping for each row of mydf:
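# for each pattern (row), the name of the class (column) with the highest count
maxes <- apply(mytable, 1, function(x) colnames(mytable)[which.max(x)])
# paste each row of mydf into a pattern string and look up its class
mydf$four <- maxes[match(paste(mydf$one, mydf$two, mydf$three), rownames(mytable))]
mydf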
#   one two three  four
# 1   1  -1     0   pos
# 2   1  -1     0   pos
# 3   0  -1     1   pos
# 4  -1  -1     0 neutr
# 5   0   0     1   neg
# 6   1  -1     0   pos
# 7   1  -1     0   pos
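The same two steps can also be wrapped in a small function, which may help if the new data has a different number of rows or contains a pattern that never occurs in mytable. This is only a sketch: assign_class and newdf are hypothetical names introduced here, the function assumes that every column of the data frame is part of the pattern, and it relies on which.max() picking the first class when counts are tied.

```r
# Rebuild the table from the question so the sketch runs on its own.
mymatrix <- matrix(c(34,11,65,32,12,9,32,90,21,51,45,23), ncol = 3)
colnames(mymatrix) <- c("pos", "neg", "neutr")
rownames(mymatrix) <- c("1 -1 0", "-1 -1 0", "0 -1 1", "0 0 1")
mytable <- as.table(mymatrix)

# assign_class() is a hypothetical helper name, not part of base R.
assign_class <- function(df, tab) {
  # most frequent class per pattern; which.max() keeps the first column on ties
  best <- colnames(tab)[apply(tab, 1, which.max)]
  names(best) <- rownames(tab)
  # paste all columns of df row-wise into pattern strings such as "1 -1 0"
  key <- do.call(paste, unname(as.list(df)))
  # look up by name; patterns that are absent from tab come back as NA
  unname(best[key])
}

# Hypothetical new data; the last row ("1 1 1") is not a pattern in mytable.
newdf <- data.frame(one   = c(1, 0, -1, 0, 1),
                    two   = c(-1, -1, -1, 0, 1),
                    three = c(0, 1, 0, 1, 1))
newdf$four <- assign_class(newdf, mytable)
newdf
```

Here the first four rows get pos, pos, neutr and neg, and the last row gets NA because its pattern has no row in mytable. Call the function before adding any other columns to the data frame, since it pastes every column into the pattern.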