Using spread from tidyr on an ffdf

Time: 2022-07-22 14:57:09

On a normal data.frame I can spread out all the data according to a particular column. How can I do the same on an ffdf?

I have an input like this:

         Uid article_Topic frqnu
1 1234567890       Cricket     2
2 1234567891       Cricket     3
3 1234567892       Cricket     4
4       abcd       Cricket     5
5 1234567894       Cricket     6
6 1234567890  Food Package     2
7 1234567895      FootBall     7

spread(data=ffg1, article_Topic, frqnu, fill=0) on a data.frame gives:

      userID Cricket Food Package FootBall 
1 1234567890       2            2        0 
2 1234567891       3            0        0 
3 1234567892       4            0        0 
4 1234567894       6            0        0 
5 1234567895       0            0        7 
6 1234567896       0            0        0 
7       abcd       5            0        0 
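
For reference, the data.frame case can be reproduced with a small in-memory sketch. The data below is typed in from the sample input above; the name long is made up for illustration, and the first column is called userID as in the code further down. The sample input contains six distinct users, so the sketch yields six rows.

```r
library(tidyr)

# Sample input from above, typed in as a plain data.frame (illustrative only)
long <- data.frame(
  userID = c("1234567890", "1234567891", "1234567892", "abcd",
             "1234567894", "1234567890", "1234567895"),
  article_Topic = c("Cricket", "Cricket", "Cricket", "Cricket",
                    "Cricket", "Food Package", "FootBall"),
  frqnu = c(2, 3, 4, 5, 6, 2, 7),
  stringsAsFactors = FALSE
)

# One row per userID, one column per article_Topic, missing cells filled with 0
wide <- spread(long, article_Topic, frqnu, fill = 0)
print(wide)
```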

Any other way of achieving a similar output would also help. I need to do this on an ffdf, and I am not very familiar with R yet. Any help is appreciated.

Update: I tried this:

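library(ff)
library(ffbase)
library(dplyr)
library(tidyr)
# Read the tab-separated file (no header row) into an ffdf
ffg <- read.csv.ffdf(file="text.txt", header=FALSE, sep="\t")
colnames(ffg) <- c("userID", "article_Topic", "frqnu")
# This call fails: there is no spread method for ffdf objects
spread(data=ffg, article_Topic, frqnu, fill=0)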

This gives an error: no applicable method for 'spread_' applied to an object of class "ffdf"

1 Solution

#1


2  

We could use ffdfdply from library(ffbase) to perform a split-apply-combine on an ffdf object. It splits the object according to the split argument, applies FUN to each chunk of data, and stores the result as an ffdf object. So, inside FUN, we can use our regular dcast:

library(ffbase)
library(reshape2)
ffdfdply(x=ffg, split=ffg$userID, FUN=function(x) {
  dcast(x, userID ~ article_Topic, value.var='frqnu', fill=0)
})

Or, with spread syntax:

library(tidyr)
ffdfdply(x=ffg, split=ffg$userID, FUN=function(x) {
  spread(x, article_Topic, frqnu, fill=0)
})
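
The split-apply-combine idea can be sketched on a small in-memory data.frame, with base R split, lapply and rbind standing in for what ffdfdply does chunk by chunk. This is only an illustration, not the ffbase implementation: the names long, chunks and wide_chunks are hypothetical, and the data is typed in from the question. One assumption worth stating: for the pieces to be combined, every chunk should produce the same columns, so here article_Topic is a factor and drop = FALSE is passed to spread so that topics absent from a chunk still get a zero-filled column.

```r
library(tidyr)

# Sample data from the question; article_Topic is a factor so that every
# chunk knows about all topics, even those it does not contain
long <- data.frame(
  userID = c("1234567890", "1234567891", "1234567892", "abcd",
             "1234567894", "1234567890", "1234567895"),
  article_Topic = factor(c("Cricket", "Cricket", "Cricket", "Cricket",
                           "Cricket", "Food Package", "FootBall")),
  frqnu = c(2, 3, 4, 5, 6, 2, 7),
  stringsAsFactors = FALSE
)

# Split: one chunk per user, so all rows of a user stay together
chunks <- split(long, long$userID)

# Apply: reshape each chunk; drop = FALSE keeps a column for every topic level
wide_chunks <- lapply(chunks, function(chunk) {
  spread(chunk, article_Topic, frqnu, fill = 0, drop = FALSE)
})

# Combine: stack the per-chunk results into a single data.frame
wide <- do.call(rbind, wide_chunks)
rownames(wide) <- NULL
print(wide)
```

Whether the same precaution is needed with ffdfdply depends on how the chunks fall in the real data, so it is best checked on a small subset first.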
