## Plotting forecast() objects in ggplot part 2: Visualize Observations, Fits, and Forecasts

In my last post I presented a function for extracting data from a forecast() object and formatting the data so that it can be plotted in ggplot. The scenario is that you are fitting a model to a time series object with training data, then forecasting out, and then visually evaluating the fit with the observations that your forecast tried to duplicate. Then you want a plot that includes: the original observations, the fitted values, the forecast values, and the observations in the forecast period. The function I presented in the last post extracts all that information in a nice ggplot ready data.frame. In this post I simulate data from an Arima process, fit an incorrect model, use the function from the last post to extract the data, and then plot in ggplot.

#----------Simulate an Arima (2,1,1) Process------------- library(forecast)

set.seed(1234) y<-arima.sim(model=list(order=c(2,1,1),ar=c(0.5,.3),ma=0.3),n=144) y<-ts(y,freq=12,start=c(2000,1)) #-- Extract Training Data, Fit the Wrong Model, and Forecast yt<-window(y,end=2009.99) yfit<-Arima(yt,order=c(1,0,1)) yfor<-forecast(yfit) #---Extract the Data for ggplot using funggcast() pd<-funggcast(y,yfor) #---Plot in ggplot2 0.9 library(ggplot2) library(scales) p1a<-ggplot(data=pd,aes(x=date,y=observed)) p1a<-p1a+geom_line(col='red') p1a<-p1a+geom_line(aes(y=fitted),col='blue') p1a<-p1a+geom_line(aes(y=forecast))+geom_ribbon(aes(ymin=lo95,ymax=hi95),alpha=.25) p1a<-p1a+scale_x_date(name='',breaks='1 year',minor_breaks='1 month',labels=date_format("%b-%y"),expand=c(0,0)) p1a<-p1a+scale_y_continuous(name='Units of Y') p1a<-p1a+opts(axis.text.x=theme_text(size=10),title='Arima Fit to Simulated Data\n (black=forecast, blue=fitted, red=data, shadow=95% conf. interval)') #p1a

Created by Pretty R at inside-R.org

## Reader Comments (5)

Hi while trying to execute the above code... It gave me an error while I was fitting the incorrect Arima model i.e. c(1,0,1). The error was: non-stationary AR part from CSS. This gets fixed if you change the method to "ML". So new code:

yfit<-Arima(yt,order=c(1,0,1),method="ML")

I wonder if you can explain this though ..... Thanks !

Hi Ankur- Thanks for the comment. I ran the code again a few times and was not able to reproduce the error but did get a warning on some occasions (but not all). My guess is that sometimes data that gets generated from

`arima.sim()`

has properties that do not lend themselves well to using conditional sum of squares to find the starting value for the maximum likelihood. Beyond that I don't know. I found that when I set the random number seed`set.seed(1234)`

the problem does not occur. My R session info is below, in case it has something to do with your version of R or forecast().R version 2.14.2 (2012-02-29)

Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

`locale:`

[1] C

`attached base packages:`

[1] parallel stats graphics grDevices utils datasets methods

[8] base

`other attached packages:`

[1] forecast_3.19 RcppArmadillo_0.2.35 Rcpp_0.9.10

[4] fracdiff_1.4-0 tseries_0.10-28 zoo_1.7-7

[7] quadprog_1.5-4 rj_1.0.0-3

`loaded via a namespace (and not attached):`

[1] grid_2.14.2 lattice_0.20-0 rj.gd_1.0.0-1 tools_2.14.2

My husband is a progarmmer, so I'll show him this article and I know for sure he'll be pleased and use it for his project.

Great post!!!

I am new to time series and found your function to be very helpful when dealing with dates along the x axis in ggplot2. I have a question if you don't mind, how can I apply funggcast to be able to show weekly breaks along the x axis?

I am pretty impressed by the things that you tell here about the forecast, observations and fits.