I want to find the number of tweets, favourites and retweets (cumulative totals are enough) for the UK General Election candidates of several parties (more than 2000 candidates) in the two months before the election. So far I have tried looping over the accounts with twitteR's userTimeline and, inside the loop (because I don't know how else to save the results), storing the number of tweets, retweets and favourites.
current is the list of Twitter usernames. I'm a programming newbie, so please don't hate:
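library(twitteR)  # provides userTimeline() and twListToDF()

tweetsy.2017 <- function(x) {
  # Download up to 3200 of the most recent tweets, including retweets and replies
  one <- userTimeline(x, n = 3200, includeRts = TRUE, excludeReplies = FALSE)
  onedf <- twListToDF(one)
  # Keep tweets from 18 April to 8 June 2017 (52 days)
  oneperiod <- subset(onedf, created >= as.POSIXct('2017-04-18 00:00:00') &
                        created <= as.POSIXct('2017-06-08 23:59:00'))
  # Original tweets only; computed here but not used in the totals below
  oneperiod2 <- oneperiod[oneperiod$isRetweet == FALSE, ]
  ro <- nrow(oneperiod)                # number of tweets in the period
  f <- sum(oneperiod$favoriteCount)    # total favourites
  re <- sum(oneperiod$retweetCount)    # total retweets
  # Sys.sleep(100)  # optional pause for the API limit; must come before the return value
  list(ro, f, re)
}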
Tweets.2017 = lapply(current, tweetsy.2017)
My problem is that this takes a very long time and produces no intermediate results. It also seems inefficient to download every tweet just to count them. I only added the sleep in case I hit the API limit, but my code seems too slow to reach it anyway.
Does anybody have a better idea? I have tried mclapply and parLapply but haven't managed to get them running.
1 solution
Wrapped it into a for loop, so I can have intermediate results. Works fine now!
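ab <- list()  # results so far; created before the loop so ab[[i]] can be filled in
# Assumes twitter_data is a data frame with the usernames in its first column
for (i in seq_len(nrow(twitter_data))) {
  print(paste("Row number", i, "of", nrow(twitter_data)))
  id <- as.character(twitter_data[i, 1])
  print(id)
  ab[[i]] <- tweetsy.2017(id)
  print("Process sleeps for a few seconds due to Twitter API limits and then continues")
  Sys.sleep(9)
}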
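With more than 2000 accounts, a single failed request (a protected or deleted account, say) would stop the loop above and lose everything collected in memory. Here is a rough sketch of one way to guard against that, not something from the original answer: it catches errors per user and writes the results to disk after every user so an interrupted run can resume. The function name collect_with_checkpoints, its arguments and the file name tweet_counts.rds are all made up for illustration; only base R functions are used.

```r
# Hypothetical helper (not part of twitteR): calls `fetch_one` for every
# username and saves the accumulated results after each one.
collect_with_checkpoints <- function(usernames, fetch_one,
                                     checkpoint_file = "tweet_counts.rds",
                                     pause_seconds = 9) {
  # Resume from an earlier run if a checkpoint file already exists
  results <- if (file.exists(checkpoint_file)) readRDS(checkpoint_file) else list()

  for (name in as.character(usernames)) {
    # Skip users that were already processed in a previous run
    if (!is.null(results[[name]])) next

    results[[name]] <- tryCatch(
      fetch_one(name),
      error = function(e) {
        # Record the failure as NA values instead of aborting the whole loop
        message("Failed for ", name, ": ", conditionMessage(e))
        list(NA, NA, NA)
      }
    )

    saveRDS(results, checkpoint_file)  # intermediate results survive a crash
    Sys.sleep(pause_seconds)           # pause between requests
  }
  results
}
```

Assuming the usernames are in the first column of twitter_data, it could be called as ab <- collect_with_checkpoints(twitter_data[, 1], tweetsy.2017). Users that failed are stored as NA and are not retried on a later run unless their entries are removed from the checkpoint file.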