How to assign a unique ID number to each group of identical values in a column [duplicate]

Time: 2023-01-22 22:56:31

I have a data frame with a number of columns. I would like to create a new column called “id” that gives a unique id number to each group of identical values in the “sample” column.

Example data:

# dput(df)
df <- structure(list(
  index = 1:30,
  val = c(14L, 22L, 1L, 25L, 3L, 34L, 35L, 36L, 24L, 35L, 33L, 31L, 30L, 30L, 29L,
          28L, 26L, 12L, 41L, 36L, 32L, 37L, 56L, 34L, 23L, 24L, 28L, 22L, 10L, 19L),
  sample = c(5L, 6L, 6L, 7L, 7L, 7L, 8L, 9L, 10L, 11L, 11L, 12L, 13L, 14L, 14L,
             15L, 15L, 15L, 16L, 17L, 18L, 18L, 19L, 19L, 19L, 20L, 21L, 22L, 23L, 23L)),
  .Names = c("index", "val", "sample"),
  class = "data.frame",
  row.names = c(NA, -30L))

head(df)
##   index val sample
## 1     1  14      5
## 2     2  22      6
## 3     3   1      6
## 4     4  25      7
## 5     5   3      7
## 6     6  34      7

What I would like to end up with:

  index val sample id
1     1  14      5  1
2     2  22      6  2
3     3   1      6  2
4     4  25      7  3
5     5   3      7  3
6     6  34      7  3

2 solutions

#1


48  

How about

df2 <- transform(df,id=as.numeric(factor(sample)))

?

I think this (cribbed from "Creating a unique ID") should be slightly more efficient, although perhaps a little harder to remember:

df3 <- transform(df, id=match(sample, unique(sample)))
all.equal(df2,df3)  ## TRUE
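
One caveat worth noting: the two approaches agree here because `sample` happens to be sorted. `factor()` numbers groups by sorted level order, while `match(x, unique(x))` numbers them by order of first appearance. A minimal sketch with a made-up unsorted vector (`x`, `id_factor` and `id_match` are hypothetical names, not from the question):

```r
# Hypothetical unsorted vector, for illustration only
x <- c(7L, 5L, 7L, 6L, 5L)

# factor() sorts the distinct values to build its levels: 5, 6, 7
id_factor <- as.integer(factor(x))

# match() against unique() numbers groups by first appearance: 7, 5, 6
id_match <- match(x, unique(x))

print(id_factor)  # 3 1 3 2 1
print(id_match)   # 1 2 1 3 2
```

Either result is a valid set of group IDs; which one is preferable depends on whether the IDs should follow the sort order of the values or the order in which they occur.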

#2


33  

Here's a data.table solution

library(data.table)
setDT(df)[, id := .GRP, by = sample]
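
Note that `setDT()` converts `df` to a data.table by reference, so the original object itself is changed. If the data frame should stay untouched, something like the following sketch works on a copy instead. The toy data and the name `dt` are assumptions for illustration, and the example assumes the data.table package is installed:

```r
library(data.table)

# Toy data standing in for the question's df (hypothetical values)
df <- data.frame(val = c(14L, 22L, 1L, 25L, 3L),
                 sample = c(5L, 6L, 6L, 7L, 7L))

# as.data.table() returns a new data.table, so df itself is not modified
dt <- as.data.table(df)

# .GRP is a group counter: 1 for the first group, 2 for the second, ...
dt[, id := .GRP, by = sample]

print(dt$id)                 # 1 2 2 3 3
print("id" %in% names(df))   # FALSE: the original data frame has no id column
```

With `by`, groups are numbered in order of first appearance, so the IDs should match those from the `match(sample, unique(sample))` approach.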
