R - merge and resulting factor levels

Time: 2022-11-15 07:36:23

I want to merge two data frames so that the merged data frame keeps only the "necessary" levels in each of its factor variables. For example:

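# stringsAsFactors=TRUE is needed in R >= 4.0.0 to get factor columns
df1 <- data.frame(country=c("AA", "BB"), stringsAsFactors=TRUE)
df2 <- data.frame(country=c("AA", "BB", "CC"), name=c("Country A", "Country B", "Country C"), stringsAsFactors=TRUE)
df3 <- merge(df1, df2, by="country")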

Then:

> df3
  country      name
1      AA Country A
2      BB Country B

which is what I expected.

However, why does the factor 'name' have 3 levels when there are only 2 rows of data?

> str(df3)
'data.frame':   2 obs. of  2 variables:
 $ country: Factor w/ 2 levels "AA","BB": 1 2
 $ name   : Factor w/ 3 levels "Country A","Country B",..: 1 2

How do I get rid of 'Country C' in df3?

> table(df3)
       name
country Country A Country B Country C
     AA         1         0         0
     BB         0         1         0

1 solution

You could try:

table(droplevels(df3))
#        name
# country Country A Country B
#      AA         1         0
#      BB         0         1

merge() keeps every level of df2$name, even levels that no longer occur in any row of the result, so 'Country C' survives as an unused level. Another way to drop it is to rebuild the factor:

df3$name <- factor(df3$name)
table(df3)
#        name
# country Country A Country B
#      AA         1         0
#      BB         0         1
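
To check the result directly rather than through table(), the two approaches can be compared with levels(). This is only a minimal sketch: it rebuilds the question's data with stringsAsFactors = TRUE set explicitly (an assumption, since R 4.0.0 and later no longer create factors by default), and the names df3_all and df3_one are made up for illustration.

```r
# Rebuild the example; stringsAsFactors = TRUE is set explicitly because
# R >= 4.0.0 no longer converts strings to factors by default.
df1 <- data.frame(country = c("AA", "BB"), stringsAsFactors = TRUE)
df2 <- data.frame(country = c("AA", "BB", "CC"),
                  name = c("Country A", "Country B", "Country C"),
                  stringsAsFactors = TRUE)
df3 <- merge(df1, df2, by = "country")

# The merged column still carries the unused level "Country C"
levels(df3$name)

# Option 1: drop unused levels from every factor column of the data frame
df3_all <- droplevels(df3)

# Option 2: drop unused levels from a single column only
df3_one <- df3
df3_one$name <- droplevels(df3_one$name)

# Both should now list only the levels that actually occur
levels(df3_all$name)
levels(df3_one$name)
```

Calling droplevels() on the whole data frame is convenient when every factor should be cleaned up; applying it (or factor()) to one column leaves the other factors untouched.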
