How do I get the k rows in each direction around every row that meets a given condition in an R data frame?

Time: 2022-01-19 07:21:57

dplyr solutions are preferred.

Let's say I have the following data:

library(tibble)

# tribble() replaces the deprecated frame_data(); the result is stored as `dat`
dat <- tribble(
~a, ~b, ~c, ~d, ~e,
1, 2, 3, 4, FALSE,
5, 6, 7, 8, TRUE,
9, 10, 11, 12, TRUE,
13, 14, 15, 16, FALSE,
17, 18, 19, 20, FALSE,
21, 22, 23, 24, FALSE,
25, 26, 27, 28, TRUE,
29, 30, 31, 32, FALSE,
33, 34, 35, 36, FALSE,
37, 38, 39, 40, FALSE
)

I'm looking to extract rows where the value in e is TRUE, and then also to extract a window of the k rows surrounding the row where e is TRUE in both directions, regardless of the value in e. For example, if k=1, I want:

tribble(
~a, ~b, ~c, ~d, ~e,
1, 2, 3, 4, FALSE,
5, 6, 7, 8, TRUE,
9, 10, 11, 12, TRUE,
13, 14, 15, 16, FALSE,
21, 22, 23, 24, FALSE,
25, 26, 27, 28, TRUE,
29, 30, 31, 32, FALSE
)

and if k=2, I want:

tribble(
~a, ~b, ~c, ~d, ~e,
1, 2, 3, 4, FALSE,
5, 6, 7, 8, TRUE,
9, 10, 11, 12, TRUE,
13, 14, 15, 16, FALSE,
17, 18, 19, 20, FALSE,
21, 22, 23, 24, FALSE,
25, 26, 27, 28, TRUE,
29, 30, 31, 32, FALSE,
33, 34, 35, 36, FALSE
)

1 Solution

#1


Here is a potential solution:

# selection window size
k <- 1

# find the row numbers where e is TRUE
foundrows <- which(dat$e)
# create row index based on found row +- window size
selectedRows <- unlist(lapply(foundrows, function(z) seq(z - k, z + k)))
# remove overlaps and out-of-bounds subscripts
selectedRows <- sort(unique(selectedRows))
selectedRows <- selectedRows[selectedRows > 0 & selectedRows <= nrow(dat)]

dat[selectedRows, ]

This is not quite as straightforward as using the lag/lead functions, but it makes the window size easy to adjust. It uses base R and keeps the row index within the bounds of the data frame.
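Since the question prefers dplyr, the same windowing idea can probably be expressed with filter() and row_number(). The following is only a sketch under that assumption: window_rows is a hypothetical helper name, not part of dplyr or of the answer's code, and the data is rebuilt here with tribble() so that the block runs on its own.

```r
library(dplyr)
library(tibble)

dat <- tribble(
  ~a, ~b, ~c, ~d, ~e,
   1,  2,  3,  4, FALSE,
   5,  6,  7,  8, TRUE,
   9, 10, 11, 12, TRUE,
  13, 14, 15, 16, FALSE,
  17, 18, 19, 20, FALSE,
  21, 22, 23, 24, FALSE,
  25, 26, 27, 28, TRUE,
  29, 30, 31, 32, FALSE,
  33, 34, 35, 36, FALSE,
  37, 38, 39, 40, FALSE
)

# Hypothetical helper (an illustration, not from the original answer):
# keep every row that lies within k rows of a row where `e` is TRUE.
window_rows <- function(data, k) {
  hits <- which(data$e)
  # Every position hit + offset for offset in -k..k, with duplicates removed.
  keep <- unique(as.vector(outer(hits, seq(-k, k), `+`)))
  # Positions below 1 or above nrow(data) never match row_number(),
  # so no explicit bounds check is needed.
  data %>% filter(row_number() %in% keep)
}

window_rows(dat, 1)$a  # 1 5 9 13 21 25 29
window_rows(dat, 2)$a  # 1 5 9 13 17 21 25 29 33
```

Because filter() keeps rows in their original order, no sort() step is required, and overlapping windows collapse naturally through the %in% test.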
