I have a long sequence of 1s and 0s which represent bird incubation patterns, 1 being bird ON the nest, 0 being OFF.
> Fake.data<- c(1,1,1,1,1,0,0,1,1,1,1,0,0,0,1,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,0,0,0,0,1,1,0,1,0)
As an end point, I would like a single value for each ON period: the length of the OFF period that follows it divided by the length of the ON period itself. Ideally, for Fake.data this would be a vector like this:
[1] 0.4 0.75 0.25 0.5 0.8 0.5 1 (I just typed this out!)
So far I have split the vector into sections using split()
> Diff<-diff(Fake.data)
> SPLIT<-split(Fake.data, cumsum(c(1, Diff > 0 )))
> SPLIT
Which returns...
$`1`
[1] 1 1 1 1 1 0 0
$`2`
[1] 1 1 1 1 0 0 0
$`3`
[1] 1 1 1 1 0
$`4`
[1] 1 1 1 1 0 0
$`5`
[1] 1 1 1 1 1 0 0 0 0
$`6`
[1] 1 1 0
$`7`
[1] 1 0
So I can get the ratio for a single split group using
> SPLIT$'1'<- ((length(SPLIT$'1'))-(sum(SPLIT$'1')))/sum(SPLIT$'1')
> SPLIT$'1'
[1] 0.4
However, my real data contain several thousand of these groups, so I would like to use something like tapply() or a for() loop to calculate the ratio for every group automatically and collect the results in a single vector. I have tried both with little success, as the list structure returned by split() does not seem to fit these functions.
I create a new vector to receive the for() loop output
ratio<-rep(as.character(NA),(length(SPLIT)))
Then I attempt the for() loop, reusing the code above that works for a single group.
for(i in SPLIT$'1':'7')
{ratio[i]<-((length(SPLIT$'[i]'))-(sum(SPLIT$'[i]')))/sum(SPLIT$'[i]')}
What I get is...
[1] "NaN" "NaN" "NaN" "NaN" "NaN" "NaN" NA
Tried many other variations along this theme but now just really stuck!
2 solutions
#1
3
I think you were very close with your strategy. The sapply function is very happy to work with lists. I would just change the last step to
sapply(SPLIT, function(x) sum(x==0)/sum(x==1))
which returns
1 2 3 4 5 6 7
0.40 0.75 0.25 0.50 0.80 0.50 1.00
with your sample data. No additional packages needed.
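If you would rather keep the for() loop, it can be made to work as well. In the attempt from the question, SPLIT$'[i]' looks up a list element literally named "[i]", which does not exist, so it returns NULL and the ratio becomes 0/0, i.e. NaN; storing that in a character vector gives "NaN". The following is a minimal sketch of a corrected loop, assuming SPLIT is the fresh output of split() (not the copy whose first element was overwritten with 0.4):

```r
Fake.data <- c(1,1,1,1,1,0,0,1,1,1,1,0,0,0,1,1,1,1,0,1,1,1,1,0,0,
               1,1,1,1,1,0,0,0,0,1,1,0,1,0)
SPLIT <- split(Fake.data, cumsum(c(1, diff(Fake.data) > 0)))

# Numeric (not character) vector to receive the results
ratio <- numeric(length(SPLIT))

for (i in seq_along(SPLIT)) {
  x <- SPLIT[[i]]                             # [[i]] extracts the i-th list element
  ratio[i] <- (length(x) - sum(x)) / sum(x)   # number of 0s / number of 1s
}
ratio
# [1] 0.40 0.75 0.25 0.50 0.80 0.50 1.00
```

The sapply version above does the same thing in one line and is usually the more idiomatic choice.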
#2
1
Here are two possibilities:
1) Compute the run lengths using rle. Then, in the if statement, drop the first length if the data starts with 0, so that we are assured of starting with a 1. Finally, compute the ratios using rollapply from the zoo package:
library(zoo)
lengths <- rle(Fake.data)$lengths
if (Fake.data[1] == 0) lengths <- lengths[-1]
rollapply(lengths, 2, by = 2, function(x) x[2]/x[1])
giving:
[1] 0.40 0.75 0.25 0.50 0.80 0.50 1.00
The if line can be removed if we know that the data always starts with a 1.
2) If we can assume that the series always starts with a 1 and ends in a 0, then this one-liner would work:
with( rle(Fake.data), lengths[values == 0] / lengths[values == 1] )
giving the same answer as above.
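If neither assumption can be relied on, the same rle idea can be written in base R so that a leading OFF run and a trailing unpaired ON run are both dropped. This is only a sketch: the function name off_on_ratio is made up for illustration, and it assumes the input contains only 0s and 1s (no NA) and at least one complete ON/OFF pair.

```r
# Hypothetical helper (the name is illustrative): OFF/ON length ratio for
# every ON run that is followed by an OFF run.
off_on_ratio <- function(x) {
  len <- rle(x)$lengths
  if (x[1] == 0) len <- len[-1]                        # drop a leading OFF run
  if (length(len) %% 2 == 1) len <- len[-length(len)]  # drop a trailing unpaired ON run
  m <- matrix(len, nrow = 2)   # row 1 = ON lengths, row 2 = OFF lengths
  m[2, ] / m[1, ]
}

Fake.data <- c(1,1,1,1,1,0,0,1,1,1,1,0,0,0,1,1,1,1,0,1,1,1,1,0,0,
               1,1,1,1,1,0,0,0,0,1,1,0,1,0)
off_on_ratio(Fake.data)
# [1] 0.40 0.75 0.25 0.50 0.80 0.50 1.00

off_on_ratio(c(0, 0, 1, 1, 0, 1))   # starts with 0 and ends with 1
# [1] 0.5
```

Filling a two-row matrix column by column pairs each ON length with the OFF length that follows it, so no extra package is needed.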