Time Series Analysis in R

Srijit Mukherjee
Srijit Mukherjee Content
3 min readMay 19, 2021

--

Download the data from here.

Load your data

Time Series is a specific data structure in R.

We have to convert data into time series data structures to apply time series algorithms.

Chile Data

library(dplyr)
data_chile = data %>% filter(Country == "Chile")
chile = data_chile[,c(2,5)]
head(chile)

Convert into time-series data structure*

  • 1,2,3,4,… as time index with no NA values.
library(zoo)
z <- read.zoo(chile, format = "%Y-%m-%d")#zoo series for dates
time(z) <- seq_along(time(z))#sequential data as 1,2,3,4,...
ts_chile = as.ts(z) #conversion into time series data structure
head(ts_chile)
  • Date as a time index with NA values to be removed.
library(zoo)
library(tseries)#for removing na from ts()
z <- read.zoo(chile, format = "%Y-%m-%d")#zoo series for dates
ts_chile = as.ts(z)
ts_chile = na.remove(ts_chile)#remove na from time series
head(ts_chile)

If you convert here, like this it will create weird numbers. I prefer to do it as 1,2,3,4,… time index.

Plot the time series

plot(ts_chile)

Two parts of the Time Series, we'll discuss are

  • Decomposition (Trend + Seasonal + Error)
  • Forecasting

Decomposition

decompose(ts_chile)
This is happening because the data has no seasonal trend

Forecast

Model (ARIMA model)

library(forecast)
model = auto.arima(ts_chile)
model

ACF Plot

acf(model$residuals, main = "Correlogram")

PACF Plot

pacf(model$residuals, main = "Partial Correlogram")

Ljung Box Test

The test determines whether or not errors are iid (i.e. white noise) or whether there is something more behind them; whether or not the autocorrelations for the errors or residuals are non zero.

Box.test(model$residuals, type = "Ljung-Box")

We do not reject the hypothesis, that the model shows a good fit.

Normality of Residuals

hist(model$residuals, freq = FALSE)
lines(density(model$residuals))

Forecast

forecast = forecast(model,4)#4 = number of units you want to 
library(ggplot2)
autoplot(forecast) #plot of the model
accuracy(forecast) #performance of the model
forecast #forecast values

Exercise: Apply it to other countries as well.

--

--

Srijit Mukherjee
Srijit Mukherjee Content

I can help you carve a career in statistics, data science, and machine learning. Know me more at https://www.linkedin.com/in/srijit-mukherjee/.