I'm just getting started with R and need some help understanding how to apply for/nested loops.
StudyID<-c(1:5)
SubjectID<-c(1:5)
df<-data.frame(StudyID=rep(StudyID, each=5), SubjectID=rep(SubjectID, each=1))
How can I create a new column called ID that combines StudyID and SubjectID into a unique ID?
So for this data, the unique IDs should run from 1 to 25.
So the final data looks like this:
UniqueID<- c(1:25)
df<-cbind(df,UniqueID)
View(df)
Is there another way that is faster and more efficient than looping?
3 Answers
#1
Using the dplyr package, you could do:
library(dplyr)
df$Id = group_indices(df,StudyID,SubjectID)
This returns:
#StudyID SubjectID Id
# 1 1 1
# 1 2 2
# 1 3 3
# 1 4 4
# 1 5 5
# 2 1 6
# 2 2 7
# 2 3 8
# 2 4 9
# 2 5 10
# 3 1 11
# 3 2 12
# 3 3 13
# 3 4 14
# 3 5 15
# 4 1 16
# 4 2 17
# 4 3 18
# 4 4 19
# 4 5 20
# 5 1 21
# 5 2 22
# 5 3 23
# 5 4 24
# 5 5 25
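A note for newer dplyr versions: as far as I know, passing column names directly to group_indices() has been deprecated since dplyr 1.0.0, and the suggested route is to group first and then call cur_group_id(). A minimal sketch of that variant, assuming dplyr 1.0.0 or later is installed and rebuilding the df from the question:

```r
library(dplyr)

# Same data as in the question: 5 studies x 5 subjects
df <- data.frame(StudyID = rep(1:5, each = 5), SubjectID = rep(1:5, times = 5))

# Group by both keys, then give every row the index of its group
df <- df %>%
  group_by(StudyID, SubjectID) %>%
  mutate(Id = cur_group_id()) %>%
  ungroup()
```

Groups are numbered in sorted key order, so for this data Id runs from 1 to 25, matching the table above.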
#2
Another way to do this in base R, without loading any package (assuming the data frame is sorted by the two columns):
StudyID<-c(1:5)
SubjectID<-c(1:5)
df<-data.frame(StudyID=rep(StudyID, each=5), SubjectID=rep(SubjectID, each=1))
df$uniqueID <- cumsum(!duplicated(df[1:2]))
Or you can use this solution, mentioned in the comments (I prefer it over the first one). It pastes only the two key columns, so it does not matter whether uniqueID has already been added:
df$uniqueID <- as.numeric(factor(do.call(paste, df[1:2])))
The output would be:
> print(df, row.names = FALSE)
#StudyID SubjectID uniqueID
# 1 1 1
# 1 2 2
# 1 3 3
# 1 4 4
# 1 5 5
# 2 1 6
# 2 2 7
# 2 3 8
# 2 4 9
# 2 5 10
# 3 1 11
# 3 2 12
# 3 3 13
# 3 4 14
# 3 5 15
# 4 1 16
# 4 2 17
# 4 3 18
# 4 4 19
# 4 5 20
# 5 1 21
# 5 2 22
# 5 3 23
# 5 4 24
# 5 5 25
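One caveat on the cumsum(!duplicated(...)) version: it only bumps the counter when an unseen combination shows up, so a combination that reappears later inherits whatever the counter is at that point. A rough sketch of an order-independent variant using match() on a pasted key; the toy data frame and the names key, id_cumsum and id_match are made up for illustration:

```r
# Toy data where the pair (2, 1) appears in rows 1 and 3 (not sorted)
df <- data.frame(StudyID = c(2, 1, 2, 1), SubjectID = c(1, 1, 1, 2))

# cumsum approach: row 3 gets the running count, not the ID of row 1
id_cumsum <- cumsum(!duplicated(df[1:2]))

# match approach: look each key up in the vector of distinct keys
key <- paste(df$StudyID, df$SubjectID, sep = "_")
id_match <- match(key, unique(key))

print(cbind(df, id_cumsum, id_match))
```

Here id_cumsum is 1 2 2 3 while id_match is 1 2 1 3, so only the match() version gives rows 1 and 3 the same ID. The IDs are numbered by first appearance rather than by sorted order.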
#3
You could use interaction() in base R:
df$uniqueID <- with(df, as.integer(interaction(StudyID,SubjectID)))
For example (this example better expresses what you are after):
set.seed(10)
df <- data.frame(StudyID=sample(5,10,replace = TRUE), SubjectID=rep(1:5,times=2))
df$uniqueID <- with(df, as.integer(interaction(StudyID,SubjectID)))
# StudyID SubjectID uniqueID
# 1 3 1 3
# 2 2 2 6
# 3 3 3 11
# 4 4 4 16
# 5 1 5 17
# 6 2 1 2
# 7 2 2 6
# 8 2 3 10
# 9 4 4 16
# 10 3 5 19
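Note that in the output above the IDs are unique per combination but not consecutive, because interaction() also reserves codes for combinations that never occur. If consecutive IDs are wanted, one option (a sketch, not part of the original answer) is the drop = TRUE argument of interaction(), which discards unused levels. The small data frame and the column names id_all and id_used below are made up for illustration:

```r
# Toy data: only 2 of the 4 possible StudyID/SubjectID combinations occur
df <- data.frame(StudyID = c(3, 2, 3), SubjectID = c(1, 2, 1))

# Default: codes index all possible combinations, so gaps can appear
df$id_all <- with(df, as.integer(interaction(StudyID, SubjectID)))

# drop = TRUE: unused combinations are removed before numbering
df$id_used <- with(df, as.integer(interaction(StudyID, SubjectID, drop = TRUE)))

print(df)
```

With drop = TRUE the IDs use only the values 1 and 2 for the two combinations that are present, and rows 1 and 3 still share an ID.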