What is the most elegant way to calculate seasonal means with R?

Time: 2021-12-28 20:21:34

I have an evenly spaced time series of daily mean observational data.

What is the easiest way to compute seasonal means? The seasons should follow the meteorological nomenclature with DJF (= winter: December, January, February), MAM, JJA, and SON.

That means the December values come from the year x-1.

The calculation of monthly means is nicely presented here: How to calculate a monthly mean?

It is possible to follow this idea when computing seasonal means. However, several caveats make it not very transparent, and one has to be careful!

I already dealt with a small part of this issue in a former thread: How to switch rows in R?

Here is now the complete story:

0: make a random time series

ts.pdsi <- data.frame(date = seq(
                from=as.Date("1901-01-01"), 
                to=as.Date("2009-12-31"), 
                by="day"))
ts.pdsi$scPDSI <- rnorm(nrow(ts.pdsi),  mean=1, sd=1)    # add some data

1st: use the seas package and add seasons to your timeseries, which has to be formatted as a data.frame.

library(seas)
# add months/seasons
ts.pdsi$month  <- mkseas(ts.pdsi,"mon")   # add months
ts.pdsi$seas <- mkseas(ts.pdsi,"DJF")     # add seasons
ts.pdsi$seasyear <- paste(format(ts.pdsi[,1],"%Y"), 
                          ts.pdsi$seas ,sep="")   # add seasyears, e.g. 1950DJF

This gives:

> head(ts.pdsi)
        date      scPDSI month seas seasyear
1 1901-01-01 -0.10881074   Jan  DJF  1901DJF
2 1901-02-01 -0.22287750   Feb  DJF  1901DJF
3 1901-03-01 -0.12233192   Mär  MAM  1901MAM
4 1901-04-01 -0.04440915   Apr  MAM  1901MAM
5 1901-05-01 -0.36334082   Mai  MAM  1901MAM
6 1901-06-01 -0.52079030   Jun  JJA  1901JJA

2nd: You can then calculate the seasonal means, following the above-mentioned approach, using the column $seasyear:

> MEAN <- tapply(ts.pdsi$scPDSI, ts.pdsi$seasyear, mean, na.rm = TRUE)
> head(MEAN)
1901DJF     1901JJA     1901MAM     1901SON     1902DJF     1902JJA 
-0.45451556 -0.72922229 -0.17669396 -1.12095590 -0.86523850 -0.04031273 

NOTE: spring (MAM) and summer (JJA) are switched due to strictly alphabetical sorting.
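
One way to avoid the reordering in the first place, sketched here under the assumption that the rows are already in chronological order, is to turn the season-year labels into a factor whose levels follow their order of first appearance. tapply then returns the groups in level order instead of alphabetical order. The toy data frame `demo` and its column names are made up for illustration, and the labels are built in base R rather than with mkseas so the block runs without extra packages.

```r
# Hypothetical toy data: one value per month for two years, in date order
demo <- data.frame(date = seq(as.Date("1901-01-01"), as.Date("1902-12-01"),
                              by = "month"))
demo$val <- seq_len(nrow(demo))

# Month number (1-12) -> season label; December shares "DJF" with Jan/Feb.
# As in the code above, December still counts toward its calendar year here.
mon  <- as.POSIXlt(demo$date)$mon + 1
seas <- c("DJF", "MAM", "JJA", "SON")[(mon %% 12) %/% 3 + 1]
lab  <- paste0(format(demo$date, "%Y"), seas)

# Levels in order of first appearance, not alphabetical order
demo$seasyear <- factor(lab, levels = unique(lab))

seas_mean <- tapply(demo$val, demo$seasyear, mean)
names(seas_mean)[1:4]
# [1] "1901DJF" "1901MAM" "1901JJA" "1901SON"
```

With the order fixed by the factor levels, the row-switching step that follows would no longer be needed.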

3rd: switch it back

foo <- MEAN
for (i in seq_along(MEAN)) {
    if (i %% 4 == 2) {        # base R has no mod(); use the %% operator
        foo[i+1] <- foo[i]    # switch 2nd and 3rd entries (JJA <-> MAM)
        foo[i] <- MEAN[i+1]
    }
}
# and generate new names for the array
d <- data.frame(date=seq(from=as.Date("1901-01-01"), to=as.Date("2009-12-31"), by="+3 month"))
d$seas <- mkseas(d,"DJF") 
d$seasyear <- paste(format(d[,1],"%Y"), d$seas ,sep="")
names(foo)<-d$seasyear  # add right order colnames
MEAN <-foo

Finally, this results in a time series of seasonal means. Well, I find it too complicated, and I guess there are much easier solutions around.

Additionally, this solution has a really major problem with the winter season DJF: so far, December is not taken from the year before. This is rather easy to fix (I guess), but it makes the given approach even more complicated.
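
The December fix can be sketched as a small change to the labelling step: take the calendar year and add one whenever the month is December. The helper name `season_year` is hypothetical, and labelling a winter by the year of its January and February is the convention assumed here.

```r
# Hypothetical helper: the year a date's meteorological season is assigned to.
# December is counted toward the following year's winter (DJF).
season_year <- function(date) {
  lt <- as.POSIXlt(date)
  (lt$year + 1900) + (lt$mon == 11)   # lt$mon is 0-based, so 11 = December
}

d <- as.Date(c("1901-11-30", "1901-12-01", "1902-01-15", "1902-03-01"))
season_year(d)
# [1] 1901 1902 1902 1902
```

Pasting `season_year(date)` instead of `format(date, "%Y")` in front of the season label would then put December 1901 into 1902DJF.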

I really hope there are better ideas around!

3 solutions

#1


3  

Is this what you want?

library(seas)

# create some data: daily values for three years
df <- data.frame(date = seq(from = as.Date("2007-01-01"),
                            to = as.Date("2009-12-31"),
                            by = "day"))
df$vals <- rnorm(nrow(df))

# add year
df$year <- format(df$date, "%Y")

# add season
df$seas <- mkseas(x = df, width = "DJF")

# calculate mean per season within each year
df2 <- aggregate(vals ~ seas + year, data = df, mean)

df2
#    seas year         vals
# 1   DJF 2007 -0.048407610
# 2   MAM 2007  0.086996842
# 3   JJA 2007  0.013864555
# 4   SON 2007 -0.081323367
# 5   DJF 2008  0.170887946
# 6   MAM 2008  0.147830260
# 7   JJA 2008  0.003008866
# 8   SON 2008 -0.057974215
# 9   DJF 2009 -0.043437437
# 10  MAM 2009 -0.048345979
# 11  JJA 2009  0.023860506
# 12  SON 2009 -0.060076870

Because mkseas converts the dates into a seasonal factor with levels in the desired order, the order is also correct after the aggregation over year and season.
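
Note that grouping by calendar year, as above, puts each December together with the January and February of the same year. A minimal sketch of grouping by a season year instead, which also counts the days in each group so that incomplete winters at the ends of the record are easy to spot. The season factor is built in base R as a stand-in for mkseas so the block runs without extra packages; `syear`, `means`, `counts` and `n` are illustrative names.

```r
# Daily toy data for three years (same layout as df above)
df <- data.frame(date = seq(as.Date("2007-01-01"), as.Date("2009-12-31"),
                            by = "day"))
set.seed(1)
df$vals <- rnorm(nrow(df))

lt <- as.POSIXlt(df$date)
# Base-R stand-in for mkseas(df, "DJF"): season factor with ordered levels
seas_names <- c("DJF", "MAM", "JJA", "SON")
df$seas <- factor(seas_names[((lt$mon + 1) %% 12) %/% 3 + 1],
                  levels = seas_names)
# Season year: December belongs to the following year's winter
df$syear <- lt$year + 1900 + (lt$mon == 11)

# Mean and number of days per season and season year
means  <- aggregate(vals ~ seas + syear, data = df, FUN = mean)
counts <- aggregate(vals ~ seas + syear, data = df, FUN = length)
means$n <- counts$vals
head(means, 5)
```

In this toy data the first winter (DJF 2007) holds only January and February, and the last one (DJF 2010) only December 2009, so their means rest on 59 and 31 days rather than a full season.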

#2


1  

It's probably easier if you use numbers rather than strings for months and seasons, at least at first. You can get the seasons you want by simple arithmetic manipulations, including making December part of the subsequent year.

pdsi <- data.frame(date = seq(
            from=as.Date("1901-01-01"), 
            to=as.Date("2009-12-31"), 
            by="day"))
pdsi$scPDSI <- rnorm(nrow(pdsi),  mean=1, sd=1)
lt <- as.POSIXlt(pdsi$date)   # gives 0-based months and years since 1900
pdsi$mon<-lt$mon+1
pdsi$seas<-floor((pdsi$mon %% 12)/3)+1
pdsi$year<-lt$year+1900
pdsi$syear<-pdsi$year
pdsi$syear[pdsi$mon==12]<-pdsi$syear[pdsi$mon==12]+1
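
To see why the modulo trick works, the season formula can be evaluated for all twelve month numbers: `%% 12` maps December to 0, so it lands in the same group as January and February.

```r
mon  <- 1:12
seas <- floor((mon %% 12) / 3) + 1   # 1 = DJF, 2 = MAM, 3 = JJA, 4 = SON
setNames(seas, month.abb)
# Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
#   1   1   2   2   2   3   3   3   4   4   4   1
```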

To compute seasonal means, you can simply do this:

meanArray<-tapply(pdsi$scPDSI,list(year=pdsi$syear,seas=pdsi$seas),mean)

And now you have

> head(meanArray)
      seas
year           1         2         3         4
  1901 1.0779676 1.0258306 1.1515175 0.9682434
  1902 0.9900312 0.8964994 1.1028336 1.0074296
  1903 0.9912233 0.9858088 1.1346901 1.0569518
  1904 0.7933653 1.1566892 1.1223454 0.8914211
  1905 1.1441863 1.1824074 0.9044940 0.8971485
  1906 0.9900826 0.9933909 0.9185972 0.8922987

If you want it as a flat array with appropriate names, first take the transpose, then flatten the array and add the names:

colnames(meanArray)<-c("DJF","MAM","JJA","SON")
meanArray<-t(meanArray)
MEAN<-array(meanArray)
names(MEAN)<-paste(colnames(meanArray)[col(meanArray)],rownames(meanArray)[row(meanArray)],sep="")

This gets you the result you wanted:

> head(MEAN)
  1901DJF   1901MAM   1901JJA   1901SON   1902DJF   1902MAM 
1.0779676 1.0258306 1.1515175 0.9682434 0.9900312 0.8964994  
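
The transpose matters because R stores matrices column by column: flattening the year-by-season table directly would list all winters first, whereas flattening its transpose walks through the seasons of one year before moving on to the next. A toy check with made-up numbers (the matrix `m` and the vector `flat` are illustrative only):

```r
# Hypothetical 2 x 4 table: rows are years, columns are seasons
m <- matrix(1:8, nrow = 2, byrow = TRUE,
            dimnames = list(year = c("1901", "1902"),
                            seas = c("DJF", "MAM", "JJA", "SON")))

as.vector(m)      # column-major: 1 5 2 6 3 7 4 8 (season by season)
as.vector(t(m))   # after transposing: 1 2 3 4 5 6 7 8 (year by year)

# Names built in the same order as the flattened values
flat <- setNames(as.vector(t(m)),
                 paste0(rep(rownames(m), each = ncol(m)), colnames(m)))
flat
```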

#3


0  

As noted, there can be very simple solutions (also posted here). I would use a combination of the zoo and seas packages to aggregate by season, looking something like this:

library(zoo); library(seas)

# dataTS is assumed to be a zoo series indexed by Date
seasTS <- aggregate(dataTS, mkseas(x=time(dataTS),width="DJF"), sum)

To do this for each year, simply loop over mkseas() by year. I'll have my coffee with a little syntactic sugar, please.
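
One way to get per-year seasonal values without an explicit loop, sketched here with zoo alone: shifting the index forward by one month before converting it to quarters makes the quarters coincide with DJF, MAM, JJA and SON, with each December counted toward the following year. The series `dataTS` below is random stand-in data, `seasq` is an illustrative name, and `mean` is used in place of `sum` because the question asks for means.

```r
library(zoo)

# Stand-in daily series; replace with the real data
dates <- seq(as.Date("2007-01-01"), as.Date("2009-12-31"), by = "day")
set.seed(42)
dataTS <- zoo(rnorm(length(dates)), dates)

# Shift by one month, then group by quarter:
# Dec-Feb -> Q1 (DJF), Mar-May -> Q2 (MAM), Jun-Aug -> Q3 (JJA), Sep-Nov -> Q4 (SON)
seasq  <- as.yearqtr(as.yearmon(time(dataTS)) + 1/12)
seasTS <- aggregate(dataTS, seasq, mean)
head(seasTS)
```

The result is a zoo series indexed by year and quarter, so "2008 Q1" here stands for December 2007 through February 2008.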

Cheers,

Adam
