On a regular data.frame I can spread the data into wide format by a particular column. How can I do the same on an ffdf?
I have an input like this.
         Uid article_Topic frqnu
1 1234567890       Cricket     2
2 1234567891       Cricket     3
3 1234567892       Cricket     4
4       abcd       Cricket     5
5 1234567894       Cricket     6
6 1234567890  Food Package     2
7 1234567895      FootBall     7
Calling spread(data=ffg1, article_Topic, frqnu, fill=0) on a data.frame gives:
      userID Cricket Food Package FootBall
1 1234567890       2            2        0
2 1234567891       3            0        0
3 1234567892       4            0        0
4 1234567894       6            0        0
5 1234567895       0            0        7
6 1234567896       0            0        0
7       abcd       5            0        0
Any other way of achieving a similar output would also help. I need to do this on an ffdf, and I am not very familiar with R yet. Any help is appreciated.
Update: I tried this:
library(ff)
library(ffbase)
library(dplyr)
library(tidyr)
ffg= read.csv.ffdf(file="text.txt",header=FALSE,sep="\t")
colnames(ffg)<-c("userID","article_Topic","frqnu")
spread(data=ffg,article_Topic,frqnu,fill=0)
This gives an error: no applicable method for 'spread_' applied to an object of class "ffdf"
1 Answer
We could use ffdfdply from library(ffbase) to perform a split-apply-combine on an ffdf object. It splits the object according to the split argument, applies FUN to the data, and stores the result as an ffdf object. So, inside FUN, we can use our regular dcast:
library(ffbase)
library(reshape2)
ffdfdply(x = ffg, split = ffg$userID, FUN = function(x) {
  dcast(x, userID ~ article_Topic, value.var = 'frqnu', fill = 0)
})
Or use the spread syntax:
library(tidyr)
ffdfdply(x = ffg, split = ffg$userID, FUN = function(x) {
  spread(x, article_Topic, frqnu, fill = 0)
})
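To see the kind of long-to-wide reshape that FUN performs on a chunk, here is a minimal sketch on an ordinary in-memory data.frame built from the sample rows in the question. It deliberately swaps in base R's xtabs instead of dcast or spread, so that the snippet needs no extra packages; it is not the ffdfdply approach itself, and the object names df, tab and wide are only illustrative.

```r
# Sample rows from the question, as a regular (in-memory) data.frame.
df <- data.frame(
  userID = c("1234567890", "1234567891", "1234567892", "abcd",
             "1234567894", "1234567890", "1234567895"),
  article_Topic = c("Cricket", "Cricket", "Cricket", "Cricket",
                    "Cricket", "Food Package", "FootBall"),
  frqnu = c(2, 3, 4, 5, 6, 2, 7),
  stringsAsFactors = FALSE
)

# Cross-tabulate: one row per userID, one column per topic, summing frqnu.
# User/topic combinations that never occur come out as 0.
tab <- xtabs(frqnu ~ userID + article_Topic, data = df)

# Turn the table into a wide data.frame and move userID from the
# row names into a regular first column.
wide <- as.data.frame.matrix(tab)
wide$userID <- rownames(wide)
rownames(wide) <- NULL
wide <- wide[, c("userID", setdiff(names(wide), "userID"))]

print(wide)
```

Note that xtabs sums frqnu when a user/topic pair occurs more than once, whereas spread would stop with an error on such duplicates, so the two are only equivalent when each pair appears at most once, as in the sample data.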