I am trying to filter out rows that are duplicated across a subset of columns in a data frame in R.
I am interested in keeping only the unique combinations of session, first, and last. This is what my data looks like:
   session                          first   last     city
1  9cf571c8faa67cad2aa9ff41f3a26e38 cat     biddix   fresno
2  e30f853d4e54604fd62858badb68113a caleb   amos
3  2ad41134cc285bcc06892fd68a471cd7 daniel  folkers
4  2ad41134cc285bcc06892fd68a471cd7 daniel  folkers
5  63a5e839510a647c1ff3b8aed684c2a5 charles pierce   flint
6  691df47f2df12f14f000f9a17d1cc40e j       franz    prescott+valley
7  691df47f2df12f14f000f9a17d1cc40e j       franz    prescott+valley
8  b3a1476aa37ae4b799495256324a8d3d carrie  mascorro brea
9  bd9f1404b313415e7e7b8769376d2705 fred    morales  las+vegas
10 b50a610292803dc302f24ae507ea853a aurora  lee
11 fb74940e6feb0dc61a1b4d09fcbbcb37 andrew  price    yorkville
2 Answers
#1
33
The following should do it:
unique(df[,c('session','first','last')])
where df is your data frame.
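Note that `unique()` on the three-column subset returns only those three columns, so city is dropped. If the remaining columns should be kept, one common alternative is to index the full data frame with `duplicated()`. The sketch below uses a small made-up data frame (shortened session values, not the asker's real data), and the names `key_cols`, `unique_keys`, and `deduped` are illustrative only:

```r
# Toy data modelled on the question; values are shortened for illustration.
df <- data.frame(
  session = c("s1", "s2", "s2", "s3", "s3"),
  first   = c("cat", "daniel", "daniel", "j", "j"),
  last    = c("biddix", "folkers", "folkers", "franz", "franz"),
  city    = c("fresno", "", "", "prescott+valley", "prescott+valley"),
  stringsAsFactors = FALSE
)

key_cols <- c("session", "first", "last")

# Unique combinations of the key columns only (city is dropped).
unique_keys <- unique(df[, key_cols])

# Keep the first full row for each key combination, city included.
deduped <- df[!duplicated(df[, key_cols]), ]
```

`duplicated()` marks every row whose key combination has already appeared, so negating it keeps the first occurrence of each combination.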
#2
2
I can't comment yet, but this is a response to Climbs_lika_Spyder.
You can count how often each unique combination occurs using the count function from the plyr library:
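library('plyr')
# Example data: two columns with repeating values
A <- rep(c('a', 'b'), 4)
B <- rep(c('c', 'd'), each = 4)
df <- data.frame(A, B)
# One row per unique (A, B) combination, with its frequency
count(df, vars = c('A', 'B'))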
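If adding plyr as a dependency is not desirable, roughly the same counts can probably be obtained with base R. This is a sketch of an alternative, not the answerer's method; the helper column `n` and the result name `counts` are made up for illustration:

```r
# Same example data as above.
A <- rep(c("a", "b"), 4)
B <- rep(c("c", "d"), each = 4)
df <- data.frame(A, B)

# Add a column of ones and sum it within each (A, B) combination.
counts <- aggregate(n ~ A + B, data = cbind(df, n = 1), FUN = sum)
```

Here `counts` has one row per combination of A and B, with the count in `n` where plyr's count reports it in a `freq` column.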