Error in R's arima: too few non-missing observations

Time: 2020-12-31 19:20:51

I am using arima() and auto.arima() in R to forecast sales. The data are weekly and cover three years.

My code looks like this:

x<-c(1571,1501,895,1335,2306,930,2850,1380,975,1080,990,765,615,585,838,555,1449,615,705,465,165,630,330,825,555,720,615,360,765,1080,825,525,885,507,884,1230,342,615,1161, 1585,723,390,690,993,1025,1515,903,990,1510,1638,1461.67,1082,1075,2315,1014,2140,1572,794,1363,1184,1248,1344,1056,816,720,896,608,624,560,512,304,640,640,704,1072,768, 816,640,272,1168,736,1003,864,658.67,768,841,1727,944,848,432,704,850.67,1205,592,1104,976,629,814,1626,933.33,1100.33,1730,2742,1552,1038,826,1888,1440,1372,824,1824,1392,1424,768,464, 960,320,384,512,478,1488,384,338.67,176,624,464,528,592,288,544,418.67,336,752,400,1232,477.67,416,810.67,1256,1040,823,240,1422,704,718,1193,1541,1008,640,752, 1008,864,1507,4123,2176,899,1717,935)

length_data <- length(x)
length_train <- round(length_data * 0.80)
forecast_period <- length_data - length_train

# first 80% of the series is used for training
train_data <- x[1:length_train]
train_data <- ts(train_data, frequency = 52, start = c(1, 1))

# remaining 20% is held out for validation
validation_data <- x[(length_train + 1):length_data]
validation_data <- ts(validation_data, frequency = 52,
  start = c(ceiling((length_train) / 52), ((length_train) %% 52 + 1)))

library(forecast)  # provides auto.arima() and Arima()
arima_output <- auto.arima(train_data) # fit the ARIMA Model
arima_validate <- Arima(x = validation_data, model = arima_output)

Error:

Error in stats::arima(x = x, order = order, seasonal = seasonal, include.mean = include.mean, :
  too few non-missing observations

What am I doing wrong? What does "too few non-missing observations" mean? I have searched the net but did not find a good explanation.

Thanks for any kind of help!

2 Solutions

#1


arima_output is a seasonal ARIMA model:

> arima_output
Series: train_data 
ARIMA(1,0,1)(0,1,0)[52]

Arima() then attempts to refit this particular model to validation_data. But a model with a seasonal difference at lag 52 needs more than one full year of data (more than 52 weekly observations), because seasonal differencing uses up the first 52 values. validation_data has only 32 observations, so nothing is left to fit.
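
To make the length requirement concrete, here is a minimal base-R sketch. It uses random placeholder data rather than the sales figures, and calls stats::arima() directly with the same model orders; the variable names are illustrative only. It counts how many values survive a lag-52 difference and captures the outcome of trying to fit the seasonal model to a 32-point series:

```r
# Placeholder data (assumption): 32 weekly values standing in for validation_data.
set.seed(1)
short_series <- ts(rnorm(32), frequency = 52)
long_series  <- ts(rep(short_series, 2), frequency = 52)  # doubled, 64 values

# A lag-52 difference pairs y[t] with y[t - 52], so it consumes the
# first 52 observations of whatever series it is applied to.
n_left_short <- length(diff(short_series, lag = 52))
n_left_long  <- length(diff(long_series, lag = 52))

# Try to fit ARIMA(1,0,1)(0,1,0)[52] to the short series and record
# either success or the error message instead of stopping the script.
fit_message <- tryCatch(
  {
    arima(short_series, order = c(1, 0, 1),
          seasonal = list(order = c(0, 1, 0), period = 52))
    "fit succeeded"
  },
  error = function(e) conditionMessage(e)
)

print(c(short = n_left_short, long = n_left_long))
print(fit_message)
```

The short series has no values left after differencing, which is the situation the error message describes; the doubled series keeps a handful.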

As an illustration, note that Arima() will happily refit, without errors, a time series that is twice as long as validation_data:

validation_data <- x[(length_train+1):length_data]
validation_data<-ts(rep(validation_data,2),frequency=52,
  start=c(ceiling((length_train)/52),((length_train)%%52+1)))
arima_validate <- Arima(x=validation_data,model=arima_output)

One way of dealing with this would be to stop auto.arima() from applying a seasonal difference by specifying D=0 (passing seasonal=FALSE instead would restrict the search to purely nonseasonal models):

validation_data <- x[(length_train+1):length_data]
validation_data<-ts(validation_data,frequency=52,
  start=c(ceiling((length_train)/52),((length_train)%%52+1)))
arima_output<-auto.arima(train_data, D=0) # fit the ARIMA Model
arima_validate <- Arima(x=validation_data,model=arima_output)

So this did turn out to be more of a CrossValidated question...

#2


Your chosen model is ARIMA(1,0,1)(0,1,0)[52]. That is, it includes a seasonal difference at lag 52. Your validation data has only 32 observations, so the seasonal differences cannot be computed on the validation data without knowing the training data that precedes it.
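
A small base-R sketch of this point, using a simulated series of 158 values as a stand-in for the sales data (the names full_series, n_train and n_valid are illustrative assumptions): when the differencing is done on the full series, each validation week finds its lag-52 partner, which for most of them lies in the training portion.

```r
# Simulated placeholder series (assumption): 158 weekly values, frequency 52.
set.seed(42)
full_series <- ts(rnorm(158, mean = 1000, sd = 300),
                  frequency = 52, start = c(1, 1))
n_train <- round(length(full_series) * 0.80)
n_valid <- length(full_series) - n_train

# Differencing the full series lets earlier values serve as the lag-52
# partners of the validation values.
seasonal_diff <- diff(full_series, lag = 52)
valid_diff <- as.numeric(tail(seasonal_diff, n_valid))

# Check the last value by hand: final observation minus the one 52 weeks earlier.
manual_last <- full_series[158] - full_series[158 - 52]

print(length(valid_diff))
print(all.equal(valid_diff[n_valid], manual_last))
```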

One way around this is to fit the model to the full time series, and then extract what you want (presumably residuals from the validation portion).

You can also improve the readability of your code:

x <- ts(x, frequency=52, start=c(1,1))
length_data <- length(x)
length_train <- round(length_data*0.80)
train_data <- ts(head(x, length_train), 
                  frequency=frequency(x), start=start(x))
validation_data <- ts(tail(x, length_data-length_train), 
                  frequency=frequency(x), end=end(x))

library(forecast)
arima_train <- auto.arima(train_data) 
arima_full <- Arima(x, model=arima_train)
res <- window(residuals(arima_full), start=start(validation_data))
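
As a possible alternative for the split itself, base R's window() keeps the time-series attributes, so the start and end bookkeeping does not have to be written by hand. This is only a sketch with simulated placeholder data; the variable names mirror the ones above, and split_time is an illustrative helper:

```r
# Simulated placeholder series (assumption), same shape as the weekly data.
set.seed(7)
x <- ts(rnorm(158, mean = 1000, sd = 300), frequency = 52, start = c(1, 1))
length_train <- round(length(x) * 0.80)

# Time stamp of the last training observation.
split_time <- time(x)[length_train]

# window() subsets by time and returns ts objects with matching attributes.
train_data <- window(x, end = split_time)
validation_data <- window(x, start = time(x)[length_train + 1])

print(c(train = length(train_data), validation = length(validation_data)))
print(start(validation_data))
```

Either way of splitting should give the same two series; window() just makes the intent a little more explicit.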
