I want to merge two data frames so that the merged data frame keeps only the "necessary" number of levels in one of its factor variables. For example:
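# stringsAsFactors=TRUE was the default before R 4.0.0; it is needed in newer versions to reproduce this
df1 <- data.frame(country=c("AA", "BB"), stringsAsFactors=TRUE)
df2 <- data.frame(country=c("AA", "BB", "CC"), name=c("Country A", "Country B", "Country C"), stringsAsFactors=TRUE)
df3 <- merge(df1, df2, by="country")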
Then:
> df3
  country      name
1      AA Country A
2      BB Country B
which is what I expected.
However, why does the factor 'name' have 3 levels when there are only 2 rows of data?
> str(df3)
'data.frame':   2 obs. of  2 variables:
 $ country: Factor w/ 2 levels "AA","BB": 1 2
 $ name   : Factor w/ 3 levels "Country A","Country B",..: 1 2
How do I get rid of 'Country C' in df3?
> table(df3)
       name
country Country A Country B Country C
     AA         1         0         0
     BB         0         1         0
1 solution
#1
You could try:
table(droplevels(df3))
#        name
#country Country A Country B
#     AA         1         0
#     BB         0         1
Here, the levels of df2$name are not dropped when you do the merge, so the unused level "Country C" is carried over into df3. Another way would be to rebuild the factor from the values that are actually present:
df3$name <- factor(df3$name)
table(df3)
#        name
#country Country A Country B
#     AA         1         0
#     BB         0         1
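To check that either approach really removes the unused level, rather than just hiding it in the table, you can inspect the levels directly. The following is a minimal sketch that rebuilds the example from the question; the name df3_clean is only illustrative, and stringsAsFactors = TRUE is set explicitly on the assumption that you are running R 4.0.0 or later, where character columns are no longer converted to factors by default.

```r
# Rebuild the example from the question with factor columns.
df1 <- data.frame(country = c("AA", "BB"), stringsAsFactors = TRUE)
df2 <- data.frame(country = c("AA", "BB", "CC"),
                  name = c("Country A", "Country B", "Country C"),
                  stringsAsFactors = TRUE)
df3 <- merge(df1, df2, by = "country")

# Number of levels before cleaning (still counts the unused "Country C").
nlevels(df3$name)

# Option 1: drop unused levels from every factor column of the data frame.
df3_clean <- droplevels(df3)   # df3_clean is a hypothetical name

# Option 2: rebuild one factor from the values actually present.
df3$name <- factor(df3$name)

# Both should now list only the levels that occur in the data.
levels(df3_clean$name)
levels(df3$name)
```

droplevels() is convenient when several factor columns may carry unused levels, while factor() on a single column leaves the rest of the data frame untouched.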