Randomly sampling a data frame using a boolean condition

Time: 2022-08-22 22:55:05

I have a data frame that contains 8 columns and 10,000 rows. I would like to randomly sample 3 rows for every combination of a "1" column and a "2" column where both values are TRUE (e.g., 1a with 2a).

My initial attempt was:

df[sample(nrow(df[df$`1a` == TRUE & df$`2a` == TRUE, ]), 3), ]

This gives the following output:

      1a    1b    1c    1d    2a    2b    2c    2d
1136 FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE
1021  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
589  FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE

It's selecting rows which are FALSE for 1a and 2a. What am I doing wrong? Thank you very much.

1 Solution

This piece of code

df[df$`1a` == TRUE & df$`2a` == TRUE, ]

should return 0 rows because there are no such cases.
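
One way to check whether any rows satisfy both conditions is to count them before sampling. This is a minimal sketch, assuming the columns are logical; the data frame `toy` and its values are made up for illustration and are not the asker's data.

```r
# Hypothetical toy data; check.names = FALSE keeps the names "1a" and "2a" as-is
toy <- data.frame(`1a` = c(TRUE, FALSE, TRUE, FALSE),
                  `2a` = c(TRUE, TRUE, FALSE, FALSE),
                  check.names = FALSE)

# Logical vector marking rows where both columns are TRUE
both <- toy$`1a` & toy$`2a`

# Number of matching rows; 0 would mean there is nothing to sample from
n_match <- sum(both)
print(n_match)
```

Non-syntactic column names such as `1a` need backticks after `$`; `toy[["1a"]]` is an equivalent way to reach the same column.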

If your data frame has more than those 10 lines, try comparing against TRUE as a character string, and sample from the positions of the matching rows rather than from their count:

df[sample(which(df$`1a` == "TRUE" & df$`2a` == "TRUE"), 3), ]
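
It may also be worth noting that in the attempt from the question, `sample(nrow(...), 3)` draws positions between 1 and the number of matching rows, and those positions are then used to index the full `df`, so the rows returned need not satisfy the condition. A sketch of one alternative, assuming logical columns: collect the matching row positions with `which()` and sample from those. The data frame `toy` and the helper name `sample_matching` are hypothetical, introduced only for this illustration.

```r
set.seed(1)  # only to make the illustration reproducible

# Hypothetical data: 20 rows, logical columns named "1a" and "2a"
toy <- data.frame(`1a` = rep(c(TRUE, FALSE), times = 10),
                  `2a` = rep(c(TRUE, TRUE, FALSE, FALSE), times = 5),
                  check.names = FALSE)

# Hypothetical helper: sample up to n rows where both named columns are TRUE
sample_matching <- function(data, col1, col2, n = 3) {
  # Positions, in the full data frame, of rows where both columns are TRUE
  idx <- which(data[[col1]] & data[[col2]])
  if (length(idx) == 0) return(data[0, , drop = FALSE])
  # Index into idx instead of calling sample(idx, ...) directly, because
  # sample() treats a single number x as the range 1:x
  picked <- idx[sample.int(length(idx), min(n, length(idx)))]
  data[picked, , drop = FALSE]
}

res <- sample_matching(toy, "1a", "2a")
print(res)
```

The same helper could be applied to each pair of a "1" column and a "2" column, for example by building the pairs with `expand.grid()` and iterating with `Map()`, to cover all combinations.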
