# CUHK- Shenzhen STOR 556 - Forecasting

## More on Forecasting

Vladas Pipiras, STOR @ UNC-CH

March, 2022

Contents:

- What is this about?
- Some simple forecasting methods
- Evaluating forecast accuracy
- More on cross-validation
- Some other forecasting methods: Exponential smoothing
- Forecast combinations
- Reading
- References

### What is this about?

We have looked at a few forecasting methods, e.g. based on ARIMA models. Here:

- A few other forecasting methods, including more naïve ones.
- How does one think about forecast accuracy?
- Which method to use? And perhaps not to use?

### Some simple forecasting methods

#### Average method

$$\hat{y}_{T+h|T} = \bar{y} = (y_1 + \cdots + y_T)/T.$$

R: `meanf(y, h)`

#### Naïve method

$$\hat{y}_{T+h|T} = y_T.$$

R: `naive(y, h)` or `rwf(y, h)`

#### Drift method

$$\hat{y}_{T+h|T} = y_T + \frac{h}{T-1}\sum_{t=2}^{T}(y_t - y_{t-1}) = y_T + h\left(\frac{y_T - y_1}{T-1}\right).$$

R: `rwf(y, h, drift=TRUE)`

#### Example

    library(fpp2)

    # Training data: quarterly beer production, 1992 Q1 to 2007 Q4
    beer2 <- window(ausbeer, start=1992, end=c(2007,4))

    # Ten-step-ahead forecasts from three simple methods
    beerfit1 <- meanf(beer2, h=10)    # average method
    beerfit2 <- rwf(beer2, h=10)      # naïve method
    beerfit3 <- snaive(beer2, h=10)   # seasonal naïve method

    # Plot the forecasts against the observed series
    autoplot(window(ausbeer, start=1992)) +
      autolayer(beerfit1, series="Mean", PI=FALSE) +
      autolayer(beerfit2, series="Naïve", PI=FALSE) +
      autolayer(beerfit3, series="Seasonal naïve", PI=FALSE) +
      xlab("Year") + ylab("Megalitres") +
      ggtitle("Forecasts for quarterly beer production") +
      guides(colour=guide_legend(title="Forecast"))

[Figure: Forecasts for quarterly beer production. x-axis: Year; y-axis: Megalitres; series: Mean, Naïve, Seasonal naïve.]

### Evaluating forecast accuracy

Forecast accuracy can only be determined by how well a model performs on new data that were not used when fitting the model, not by the size of the model residuals.

#### Training and test sets

- A model which fits the training data well will not necessarily forecast well.
- A perfect fit can always be obtained by using a model with enough parameters.
- Over-fitting a model to data is just as bad as failing to identify a systematic pattern in the data.

Measures of forecast accuracy are based on:

#### Forecast errors

$$e_{T+h} = y_{T+h} - \hat{y}_{T+h|T}$$

#### Scale-dependent errors

Mean absolute error: $\text{MAE} = \text{mean}(|e_t|)$,

Root mean squared error: $\text{RMSE} = \sqrt{\text{mean}(e_t^2)}$.

#### Percentage errors
The percentage error is given by $p_t = 100\, e_t / y_t$.

Mean absolute percentage error: $\text{MAPE} = \text{mean}(|p_t|)$.

#### Scaled errors

$$\text{MASE} = \text{mean}(|q_j|),$$

where for a non-seasonal time series

$$q_j = \frac{e_j}{\dfrac{1}{T-1}\sum_{t=2}^{T}|y_t - y_{t-1}|}$$

and for a seasonal time series with seasonal period $m$

$$q_j = \frac{e_j}{\dfrac{1}{T-m}\sum_{t=m+1}^{T}|y_t - y_{t-m}|}.$$

#### Example

    # Test data: observations from 2008 onwards
    beer3 <- window(ausbeer, start=2008)
    accuracy(beerfit1, beer3)
    ##                   ME     RMSE      MAE        MPE     MAPE     MASE        ACF1
    ## Training set   0.000 43.62858 35.23438 -0.9365102 7.886776 2.463942 -0.10915105
    ## Test set     -13.775 38.44724 34.82500 -3.9698659 8.283390 2.435315 -0.06905715
    ##              Theil's U
    ## Training set        NA
    ## Test set      0.801254
    accuracy(beerfit2, beer3)
    ##                       ME     RMSE      MAE         MPE     MAPE     MASE
    ## Training set   0.4761905 65.31511 54.73016  -0.9162496 12.16415 3.827284
    ## Test set     -51.4000000 62.69290 57.40000 -12.9549160 14.18442 4.013986
    ##                     ACF1 Theil's U
    ## Training set -0.24098292        NA
    ## Test set     -0.06905715  1.254009
    accuracy(beerfit3, beer3)
    ##                     ME     RMSE  MAE        MPE     MAPE      MASE       ACF1
    ## Training set -2.133333 16.78193 14.3 -0.5537713 3.313685 1.0000000 -0.2876333
    ## Test set      5.200000 14.31084 13.4  1.1475536 3.168503 0.9370629  0.1318407
    ##              Theil's U
    ## Training set        NA
    ## Test set      0.298728

#### Another example

    # Forecast 40 days ahead from the first 200 observations of the Google series
    googfc1 <- meanf(goog200, h=40)
    googfc2 <- rwf(goog200, h=40)
    googfc3 <- rwf(goog200, drift=TRUE, h=40)
    autoplot(subset(goog, end = 240)) +
      autolayer(googfc1, PI=FALSE, series="Mean") +
      autolayer(googfc2, PI=FALSE, series="Naïve") +
      autolayer(googfc3, PI=FALSE, series="Drift") +
      xlab("Day") + ylab("Closing Price (US$)") +
      ggtitle("Google stock price (daily ending 6 Dec 13)") +
      guides(colour=guide_legend(title="Forecast"))

[Figure: Google stock price (daily ending 6 Dec 13). x-axis: Day; y-axis: Closing Price (US$); series: Drift, Mean, Naïve.]

    # Test data: days 201 to 240
    googtest <- window(goog, start=201, end=240)
    accuracy(googfc1, googtest)
    ##                         ME      RMSE       MAE        MPE     MAPE      MASE
    ## Training set -4.296286e-15  36.91961  26.86941 -0.6596884  5.95376  7.182995
    ## Test set      1.132697e+02 114.21375 113.26971 20.3222979 20.32230 30.280376
    ##                   ACF1 Theil's U
    ## Training set 0.9668981        NA
    ## Test set     0.8104340  13.92142
    accuracy(googfc2, googtest)
    ##                      ME      RMSE       MAE       MPE      MAPE     MASE
    ## Training set  0.6967249  6.208148  3.740697 0.1426616 0.8437137 1.000000
    ## Test set     24.3677328 28.434837 24.593517 4.3171356 4.3599811 6.574582
    ##                     ACF1 Theil's U
    ## Training set -0.06038617        NA
    ## Test set      0.81043397  3.451903
    accuracy(googfc3, googtest)
    ##                         ME      RMSE       MAE         MPE      MAPE     MASE
    ## Training set -5.998536e-15  6.168928  3.824406 -0.01570676 0.8630093 1.022378
    ## Test set      1.008487e+01 14.077291 11.667241  1.77566103 2.0700918 3.119002
    ##                     ACF1 Theil's U
    ## Training set -0.06038617        NA
    ## Test set      0.64732736  1.709275

#### Time series cross-validation

A more sophisticated version of training/test sets is time series cross-validation.

The forecast accuracy is computed by averaging over the test sets.

    # One-step-ahead cross-validated errors for the drift method
    e <- tsCV(goog200, rwf, drift=TRUE, h=1)
    sqrt(mean(e^2, na.rm=TRUE))
    ## [1] 6.233245
    # For comparison: RMSE of the in-sample residuals
    sqrt(mean(residuals(rwf(goog200, drift=TRUE))^2, na.rm=TRUE))
    ## [1] 6.168928

A good way to choose the best forecasting model is to find the model with the smallest RMSE computed using time series cross-validation.

### More on cross-validation

#### Variants of prequential approaches

#### Variants of cross validation

Some observations from Cerqueira et al. (2020):

- Empirical experiments suggest that blocked cross-validation can be applied to stationary time series.
- When the time series are non-stationary, the most accurate estimates are produced by out-of-sample methods, particularly the holdout approach repeated in multiple testing periods.

### Some other forecasting methods: Exponential smoothing

#### Simple exponential smoothing (SES)

For forecasting data with no clear trend or seasonal pattern:

$$\hat{y}_{T+1|T} = \alpha y_T + \alpha(1-\alpha) y_{T-1} + \alpha(1-\alpha)^2 y_{T-2} + \cdots$$

where $\alpha \in [0,1]$ is the smoothing parameter.
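To make the weighting concrete, here is a minimal base-R sketch of the weighted-average form. The value `alpha = 0.8` and the series `y` are made-up illustrations, not taken from the notes. Truncating the infinite sum at the available observations is an approximation.

```r
# Illustrative smoothing parameter (an assumption, not an estimate)
alpha <- 0.8

# Weights placed on y_T, y_{T-1}, ..., y_{T-9}: alpha * (1 - alpha)^j
lags <- 0:9
weights <- alpha * (1 - alpha)^lags

# Made-up toy series, oldest observation first
y <- c(445, 453, 454, 422, 456, 440, 425, 486, 500, 521)

# Weighted average of the observations, most recent first
forecast_approx <- sum(weights * rev(y))

print(round(weights[1:4], 4))  # weights decay geometrically with the lag
print(sum(weights))            # equals 1 - (1 - alpha)^10, i.e. close to 1
print(forecast_approx)
```

With `alpha` near 1 almost all the weight falls on the latest observation, so the forecast approaches the naïve method. With `alpha` near 0 the weights decay slowly and distant observations still matter.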
This weighted average can be rewritten as

$$\hat{y}_{T+1|T} = \alpha y_T + (1-\alpha)\hat{y}_{T|T-1}$$

and more generally as

$$\hat{y}_{t+1|t} = \alpha y_t + (1-\alpha)\hat{y}_{t|t-1}.$$

This is also expressed in a component form as

- Forecast equation: $\hat{y}_{t+h|t} = \ell_t$
- Smoothing equation: $\ell_t = \alpha y_t + (1-\alpha)\ell_{t-1}$.

The smoothing parameter $\alpha$ and the starting value $\ell_0$ are chosen to minimize

$$\text{SSE} = \sum_{t=1}^{T}(y_t - \hat{y}_{t|t-1})^2 = \sum_{t=1}^{T} e_t^2.$$

#### Example

    oildata <- window(oil, start=1996)
    # Estimate parameters
    fc <- ses(oildata, h=5)
    fc$model
    ## Simple exponential smoothing
    ##
    ## Call:
    ##  ses(y = oildata, h = 5)
    ##
    ##   Smoothing parameters:
    ##     alpha = 0.8339
    ##
    ##   Initial states:
    ##     l = 446.5868
    ##
    ##   sigma:  29.8282
    ##
    ##      AIC     AICc      BIC
    ## 178.1430 179.8573 180.8141
    fc
    ##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
    ## 2014       542.6806 504.4541 580.9070 484.2183 601.1429
    ## 2015
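The component form and the SSE criterion can be sketched in base R without any packages. The helpers `ses_sse()` and `ses_fit()` are hypothetical names written for this illustration, and the data are made up. This is not how `ses()` works internally, and `ses()` may estimate its parameters differently, so the numbers need not match its output.

```r
# SSE of one-step-ahead SES forecasts for given (alpha, l_0).
# Hypothetical helper for illustration; not part of the forecast package.
ses_sse <- function(par, y) {
  alpha <- par[1]
  level <- par[2]                  # l_0, the starting level
  sse <- 0
  for (t in seq_along(y)) {
    e <- y[t] - level              # e_t = y_t - yhat_{t|t-1}, with yhat_{t|t-1} = l_{t-1}
    sse <- sse + e^2
    level <- alpha * y[t] + (1 - alpha) * level   # smoothing equation
  }
  sse
}

# Choose alpha in [0, 1] and l_0 by minimizing the SSE numerically.
ses_fit <- function(y) {
  opt <- optim(
    par = c(0.5, y[1]),            # starting guess: alpha = 0.5, l_0 = first value
    fn = ses_sse, y = y,
    method = "L-BFGS-B",
    lower = c(0, -Inf), upper = c(1, Inf)
  )
  alpha <- opt$par[1]
  level <- opt$par[2]
  # Run the recursion once more to obtain the final level l_T
  for (t in seq_along(y)) level <- alpha * y[t] + (1 - alpha) * level
  # Forecast equation: yhat_{T+h|T} = l_T for every horizon h (flat forecasts)
  list(alpha = alpha, l0 = opt$par[2], forecast = level, sse = opt$value)
}

# Made-up toy series
y <- c(445, 453, 454, 422, 456, 440, 425, 486, 500, 521)
fit <- ses_fit(y)
print(fit)
```

Because the forecast equation does not depend on $h$, the point forecasts from this sketch are the same for every horizon. Only the prediction intervals, which are not computed here, would widen with $h$.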
