I have a data frame with 8 columns and 10,000 rows. I would like to randomly sample 3 rows for every combination of a "1" column with a "2" column where both values are TRUE (e.g. 1a with 2a).
My initial attempt was:
df[sample(nrow(df[df$`1a` == TRUE & df$`2a` == TRUE, ]), 3), ]
which gives the following output:
        1a    1b    1c    1d    2a    2b    2c    2d
1136 FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE
1021  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
589  FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE
It's selecting rows which are FALSE for 1a and 2a. What am I doing wrong? Thank you very much.
1 solution
#1
2
This piece of code
df[df$`1a` == TRUE & df$`2a` == TRUE, ]
should return 0 rows for the sample shown, because none of those rows has TRUE in both 1a and 2a.
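A quick way to check this is to count the matching rows and look at the column types. The sketch below uses a small made-up data frame (`toy` is hypothetical, not the asker's data) that mirrors the 1a and 2a values of the three rows shown. Note that column names starting with a digit have to be quoted with backticks.

```r
# Hypothetical toy data mirroring the 1a/2a values of the rows in the question
toy <- data.frame(
  `1a` = c(FALSE, TRUE, FALSE),
  `2a` = c(FALSE, FALSE, FALSE),
  check.names = FALSE  # keep the non-syntactic names "1a" and "2a"
)

# Logical vector marking rows where both columns are TRUE
both_true <- toy$`1a` == TRUE & toy$`2a` == TRUE

sum(both_true)       # number of rows where both columns are TRUE
sapply(toy, class)   # shows whether the columns are logical or character
```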
If your data frame has more rows than the ones shown, try comparing against TRUE as a character string:
sub <- df[df$`1a` == "TRUE" & df$`2a` == "TRUE", ]
sub[sample(nrow(sub), 3), ]
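Separately from the comparison, the sampling step in the question's attempt may explain the output: `sample(nrow(subset), 3)` draws positions between 1 and the number of rows in the subset, but those positions are then used to index the full `df`, so rows that fail the condition can come back. Below is a minimal sketch of one way around this, extended to every "1"/"2" column pair. It assumes the columns are logical and runs on made-up data; `toy`, `sample_pair` and `result` are illustrative names, not from the original post.

```r
set.seed(1)  # only to make the illustration repeatable

# Hypothetical data: 8 logical columns named as in the question
cols <- c("1a", "1b", "1c", "1d", "2a", "2b", "2c", "2d")
toy <- as.data.frame(
  matrix(sample(c(TRUE, FALSE), 8 * 200, replace = TRUE), ncol = 8)
)
names(toy) <- cols

# Return up to n randomly chosen rows where both columns are TRUE
sample_pair <- function(data, c1, c2, n = 3) {
  idx <- which(data[[c1]] & data[[c2]])  # positions in `data`, not in a subset
  if (length(idx) < n) return(data[idx, , drop = FALSE])  # fewer than n matches
  data[idx[sample.int(length(idx), n)], , drop = FALSE]
}

# One sample per combination of a "1" column with a "2" column
pairs <- expand.grid(c1 = cols[1:4], c2 = cols[5:8], stringsAsFactors = FALSE)
result <- Map(function(c1, c2) sample_pair(toy, c1, c2), pairs$c1, pairs$c2)
names(result) <- paste(pairs$c1, pairs$c2, sep = "_")
```

`result` is a list with one small data frame per pair, for example `result[["1a_2a"]]`. Sampling positions with `sample.int()` rather than passing `idx` directly to `sample()` avoids the surprise where `sample(x, ...)` treats a single number `x` as the range `1:x`.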