Difference between revisions of "SMHS TimeSeries"

From SOCR
Jump to: navigation, search
(Problems)
(Scientific Methods for Health Sciences - Time Series Analysis)
 
(15 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
==[[SMHS| Scientific Methods for Health Sciences]] - Time Series Analysis ==
 
==[[SMHS| Scientific Methods for Health Sciences]] - Time Series Analysis ==
 +
 +
* See the [https://socr.umich.edu/DSPA2/DSPA2_notes/12_LongitudinalDataAnalysis.html '''DSPA2 Extensible Longitudinal Data Analysis Chapter'''].
 +
* [https://sda.statisticalcomputing.org/ The '''SOCR Statistical Data Analyzer (SDA)''' provides live app-based data and multiple tools for interactive time-varying data analytics].
  
 
===Overview===
 
===Overview===
Time series data is a sequence of data points measured at successive pints in time spaced intervals. Time series analysis is commonly used in varieties of studies like monitoring industrial processes and tracking business metrics in the business world. In this lecture, we will present a general introduction to the time series data and introduced on some of the most commonly used techniques in the rick and rapidly growing field of time series modeling and analysis to extract meaningful statistics and other characteristics of the data. We will also illustrate the application of time series analysis techniques with examples in R.  
+
Time series data is a sequence of data points measured at successive, equally spaced points in time. Time series analysis is utilized across various disciplines, such as monitoring industrial processes, tracking business metrics, and analyzing longitudinal health records. In this module, we present a comprehensive introduction to time series data and explore both classical and modern techniques used in this rapidly growing field. The goal is to extract meaningful statistics, identify underlying patterns (such as trends and seasonality), and construct predictive models. We will illustrate the application of these techniques using interactive examples in R.
  
 
===Motivation===
 
===Motivation===
Economic data like daily share price, monthly sales, annual income or physical data like daily temperature, ECG readings are all examples of time series data. Time series is just an ordered sequence of values of a variable measured at equally spaced time intervals. So, what would be the effective ways to measure time series data? How can we extract information from time series data and make inference afterwards?
+
Economic data (e.g., daily share prices, monthly sales, annual income) and biophysical data (e.g., daily temperature, ECG readings, epidemiological counts) are all examples of time series data. A time series is an ordered sequence of values of a variable measured at equally spaced time intervals. What are the most effective ways to model time series data? How can we extract actionable information, make inferences about underlying mechanisms, and forecast future observations? Answering these questions requires understanding the mathematical properties of time series processes and applying robust statistical modeling techniques.
  
 
===Theory===
 
===Theory===
Time series: a sequence of data points measured at successive pints in time spaced intervals.
 
*Components: (1) trend component: long-run increase or decrease over time (overall upward or downward movement), typically refers to data taken over a long period of time; the trend can be upward or downward, linear or non-linear. (2) seasonal component: short-term regular wave-like patterns, usually refers to data observed within one year, which may be measured monthly or quarterly. (3) cyclical component: long-term wave-like patterns; regularly occur but may vary in length; often measured peak to peak or trough to trough. (4) irregular component: unpredictable, random, ‘residual’ fluctuation, which may be due to random variations of nature or accidents or unusual events; ‘noise’ in the time series.
 
*Probabilistic models: The components still add or multiply together across time.
 
[[Image:SMHS Fig 1 Times Series Analysis.png|400px]]
 
*Simple Time Series Models:
 
**Take $T_{t}$ as the trend component, $S_{t}$ as the seasonal component, $C_{t}$ as the cyclical component and $I_{t}$ as the irregular or random component. Then we have an additive model as: $x_{t}=T_{t}+S_{t}+C_{t}+I_{t};$ the multiplicative model says $x_{t}=T_{t}*S_{t}*C_{t}*I_{t};$ sometimes, we take logs on both sides of the multiplicative model to make it additive, $logx_{t}=log\left ( T_{t}*S_{t}*C_{t}*I_{t} \right )$, which can be further noted as $x_{t}{}'=T_{t}{}'+S_{t}{}'+C_{t}{}'+I_{t}{}'$.
 
**Most time series models are written in terms of an (unobserved) white noise process, which is often assumed to be Gaussian: $x_{t}=w_{t}$, where $w_{t}\sim WN\left ( 0,1 \right )$, that is, $E\left ( W_{t} \right )=0, Var\left(W_{t} \right )=\sigma ^{2}$. Examples of probabilistic models include: (1) Autoregressive $x_{t}=0.6x_{t-1}+w_{t};$ (2) Moving average model $x_{t}= \frac{1}{3}\left(w_{t}+w_{t-1}+w_{t-2}\right).$
 
**Fitting time series models: a time series model generates a process whose pattern can then be matched in some way to the observed data; since perfect matches are impossible, it is possible that more than one model will be appropriate for a set of data; to decide which model is appropriate: patterns suggest choices, assess within sample adequacy (diagnostics, tests), outside sample adequacy (forecast evaluation), simulation from suggested model and compare with observed data; next, turn to some theoretical aspects like how to characterize a time series, and then investigate some special processes. 
 
**Characterizing time series (the mean and covariance of time series): suppose the data are $x_{1},x_{2},\cdots ,x_{t},$ note for a regularly spaced time series, $x_{1}$ is observed at time $t_{0}$, $x_{2}$ is observed at $t_{0}+\Delta, x_{t}$ is observed at $t_{0}+\left( t-1\right) \Delta$; the expected value of $x_{t}$ is $\mu _{t}=E\left [ x_{t} \right ]$; the covariance function is $\gamma \left(s,t \right )=cov\left(x_{s},x_{t} \right )$. Note that, we don’t distinguish between random variables and their realizations. Properly, $x_{1},x_{2},\cdots ,x_{t}$ is a realization of some process $X_{1},X_{2},\cdots ,X_{t}$ and so $\mu _{t}=E\left [ x_{t} \right ]$ etc.
 
**Characterizing time series data: weak stationarity: usually we have only one realization of the time series. There are not enough data to estimate all the means \& covariances. That is if the mean is constant, and the covariance depends only on the separation in time, then a time series is $\left(weakly\right)$ stationary. In this case  $\mu_{t}=\mu$ and $\gamma\left(s,t\right)=\gamma\left(t-s\right)$. Stationary processes are much easier to handle statistically. Real time series are usually non-stationary, but can often be transformed to give a stationary process.
 
**Estimation of $\gamma(h)$ (Estimating the autocovariance function): (1) for a stationary process: $\gamma(h)=\gamma(s,s+h)=cov(x_{s},x_{s+h})$ for any s.$\ \gamma (h)$ is called the autocovariance function because it measures the covariance between lags of the process $x_{t}$; (2) we observe T-h pairs separated at lag h namely $\left(x_{1},x_{h+1}\right),\cdots,\left(x_{T-h},x_{T}\right)$; the sample autocovariance function is $\hat{\gamma}(h)=\frac{1}{T}\sum_{t=1}^{T-h}\left(x_{t}-\bar{X} \right )\left(x_{t+h}-\bar{X} \right )$, note that we divide by ${T}$ although there are ${T-h}$ terms in the sum. The autocorrelation function ${ACF}$ is $\rho (h)=\frac{\gamma(h)}{\gamma(0)}$.
 
**$\rho(h)$ properties: for $x_{t}$ stationary, $\gamma(h)=E(x_{t}-\mu{t})(x_{t+h}-\mu)$, where $\gamma(h)$ is an even function, that is $\gamma(h)=\gamma(-h); \left | \gamma(h) \right |\leq \left | \gamma(0) \right |$, that is $\left|\rho(h)\right|\leq1, h=\pm 1,2,\cdots$. The autocorrelation matrix, $P(h)$ is positive semi-definite which means that the autocorrelations cannot be chosen arbitrarily but must have some relations among themselves.
 
**To study a set of time series data: given that we only have one realization, we need to match plot of data from proposed model with data (we can see broad/main patterns); match acf, pacf of proposed model with data (we can see if data are stationary uncorrelated, stationary autocorrelated or non-stationary); determine if the proposed model is appropriate; obtain forecasts from proposed model; compare across models that seem appropriate.
 
**Large sample distribution of the $ACF$ for a $WN$ series: if $x_{t}$ is $WN$ then for n large, the sample $ACF$, $\hat{\rho}_{x}(h), h=1,2,\cdots$,H where H is fixed but arbitrary is normally distributed with zero mean and standard deviation given by $\sigma_{\hat{\rho}_{x}}(h)=\frac{1}{\sqrt{h}}$. A rule of thumb for assessing whether sample autocorrelations resemble those for a $WN$ series is by checking if approximately 95% of the sample autocorrelations are within the interval $0\pm \frac{2}{\sqrt{n}}$.
 
**White noise, or stationary uncorrelated process: we say that a time series is white noise if it is weakly stationary, with $ACF$, $\rho(h)=\begin{cases} 1 & \text{ if } h=0 \\ 0 & \text{ if } h> 0\ and\ with \ E\left[w_{t} \right ]=0 \end{cases}$, i.e. white noise consists of a sequence of uncorrelated, zero mean random variables with common variance $\sigma^{2}$.
 
  
*Example of a simple moving average model in R: $v_{t}=\frac{1}{3} (w_{t-1}+w_{t}+w_{t+1})$.  
+
====Components of Time Series====
 +
A time series <math>x_t</math> can generally be decomposed into three primary components, plus an irregular error term:
 +
* '''Trend Component (<math>T_t</math>):''' A long-run increase or decrease over time (overall upward or downward movement). The trend can be linear or non-linear.
 +
* '''Seasonal Component (<math>S_t</math>):''' Short-term, regular wave-like patterns completing a cycle within a fixed period (e.g., daily, monthly, quarterly).
 +
* '''Cyclical Component (<math>C_t</math>):''' Long-term wave-like patterns with variable lengths, often measured peak-to-peak or trough-to-trough, typically tied to economic or biological cycles.
 +
* '''Irregular Component (<math>I_t</math>):''' Unpredictable, random residuals, or "noise," which may be due to random variations or unusual events.
  
R CODE:  
+
These components can combine additively or multiplicatively:
w <- ts(rnorm(150))  ## generate 150 data from the standard normal distribution
+
* '''Additive Model:''' <math>x_{t}=T_{t}+S_{t}+C_{t}+I_{t}</math>
v <- ts(filter(w, sides=2, rep(1/3, 3))) ## moving average model
+
* '''Multiplicative Model:''' <math>x_{t}=T_{t} \times S_{t} \times C_{t} \times I_{t}</math>
par(mfrow=c(2,1))
+
A logarithmic transformation can convert a multiplicative model into an additive one: <math>\log(x_{t})=\log(T_{t})+\log(S_{t})+\log(C_{t})+\log(I_{t})</math>.
plot(w,main='white noise')
+
* '''Probabilistic models''': The components still add or multiply together across time.
plot(v,main='moving average')
+
[[Image:SMHS Fig 1 Times Series Analysis.png|400px]]
  
or can use code:
+
====Stationarity====
 +
Most statistical forecasting methods assume that a time series is approximately stationary. A time series is said to be '''strictly stationary''' if its statistical properties are unaffected by a shift in time. In practice, we rely on '''weak stationarity''', which requires:
 +
1. The mean is constant over time: <math>E[x_t] = \mu</math> for all <math>t</math>.
 +
2. The variance is finite and constant: <math>Var(x_t) = \gamma(0) < \infty</math>.
 +
3. The autocovariance between any two points depends only on the time lag <math>h</math> between them: <math>Cov(x_t, x_{t+h}) = \gamma(h)</math> for all <math>t</math> and <math>h</math>.
  
w <- rnorm(150)  ## generate 150 data from the standard normal distribution
+
Real-world time series are usually non-stationary (exhibiting trends or varying variance) but can often be transformed to achieve stationarity (e.g., differencing, log transformations).
v <- filter(w, sides=2, rep(1/3, 3)) ## moving average model
 
par(mfrow=c(2,1))
 
plot.ts(w,main='white noise')
 
plot.ts(v,main='moving average')
 
  
[[Image:SMHS Fig2 Timeseries Analysis.png|500px]]
+
====Autocorrelation and Partial Autocorrelation====
 +
To characterize a time series, we examine its mean and covariance structure. For a stationary process, the '''Autocovariance Function (ACVF)''' at lag <math>h</math> is defined as:
 +
<math>\gamma(h) = E[(x_t - \mu)(x_{t+h} - \mu)]</math>
 +
The '''Autocorrelation Function (ACF)''' standardizes this by the variance <math>\gamma(0)</math>:
 +
<math>\rho(h) = \frac{\gamma(h)}{\gamma(0)}</math>
 +
Properties of the ACF include: <math>\rho(0) = 1</math>, <math>|\rho(h)| \le 1</math>, and <math>\rho(h) = \rho(-h)</math> (it is an even function).
  
## sums based on WN processes
+
The '''Partial Autocorrelation Function (PACF)''' measures the correlation between <math>x_t</math> and <math>x_{t+h}</math> after removing the linear dependence on the intermediate observations <math>x_{t+1}, \dots, x_{t+h-1}</math>. The ACF and PACF are crucial tools for identifying the order of Autoregressive (AR) and Moving Average (MA) models.
'''ts.plot(w,v,lty=2:1,col=1:2,lwd=1:2)'''
 
  
[[Image:SMHS Fig3 TimeSeries Analysis.png|500px]]
+
=====Estimation of ACF=====
 +
Given <math>T</math> observations, the sample autocovariance is estimated as:
 +
<math>\hat{\gamma}(h) = \frac{1}{T} \sum_{t=1}^{T-h} (x_t - \bar{x})(x_{t+h} - \bar{x})</math>
 +
The sample ACF is <math>\hat{\rho}(h) = \hat{\gamma}(h) / \hat{\gamma}(0)</math>.
 +
Under the null hypothesis that the true process is White Noise, the large-sample standard error of the sample ACF is approximately:
 +
<math>SE(\hat{\rho}(h)) \approx \frac{1}{\sqrt{T}}</math>
 +
A common rule of thumb is that 95% of the sample autocorrelations for a White Noise process should fall within the bounds <math>\pm 2/\sqrt{T}</math>.
  
The ACF and the sample ACF of some stationary processes
+
====White Noise====
*white noise = $w(t),acf x(t)=w(t)+\frac{1}{3} \left(w(t-1)+w(t-2)+w(t-3) \right),$ then plot the barplot of the acf:
+
A time series <math>w_t</math> is White Noise (<math>WN</math>) if it is weakly stationary with:
 +
<math>\rho(h) = \begin{cases} 1 & \text{if } h=0 \\ 0 & \text{if } h \neq 0 \end{cases}</math>
 +
A special case is Gaussian White Noise, denoted <math>w_t \sim \text{WN}(0, \sigma^2)</math>, where <math>w_t</math> is independently and identically distributed (i.i.d.) according to a normal distribution.
  
RCODE:
+
====Backshift Notation and Linear Operators====
par(mfrow=c(2,1))
+
Modern time series theory heavily utilizes the backshift operator <math>B</math>, defined as <math>Bx_t = x_{t-1}</math>.
barplot(ARMAacf(0,lag.max=10),main='white noise=w(t)')
+
* Differencing can be written as <math>\nabla x_t = (1-B)x_t</math>.
barplot(ARMAacf(ma=c(1/3,1/3,1/3),lag.max=10),main='acfx(t)=w(t)+1/3w(t-1)+1/3w(t-2)+1/3w(t-3)')
+
* An AR(1) model <math>x_t = \phi x_{t-1} + w_t</math> is written as <math>(1-\phi B)x_t = w_t</math>.
  
[[Image:SMHS_Fig4_TimeSeries_Analysis.png|500px]]
+
====Autoregressive (AR) Models====
 +
An autoregressive model of order <math>p</math>, denoted AR(<math>p</math>), models the current value as a linear combination of its past values plus a shock:
 +
<math>x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_p x_{t-p} + w_t</math>
 +
Using backshift notation: <math>\phi(B)x_t = w_t</math>, where <math>\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p</math>.
 +
'''Assumptions:''' The process must be stationary, which requires that the roots of <math>\phi(B) = 0</math> lie strictly outside the unit circle.
 +
'''ACF/PACF Patterns:''' The ACF of a stationary AR(<math>p</math>) process decays gradually (tails off), while the PACF cuts off abruptly after lag <math>p</math>.
  
*Theoretical acf of some processes: Autoregressive process, $x_{t}=0.9x_{t-1}+w_{t}.$
+
====Moving Average (MA) Models====
 +
A moving average model of order <math>q</math>, denoted MA(<math>q</math>), models the current value as a linear combination of current and past white noise shocks:
 +
<math>x_t = w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \dots + \theta_q w_{t-q}</math>
 +
Using backshift notation: <math>x_t = \theta(B)w_t</math>, where <math>\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q</math>.
 +
'''Assumptions:''' MA processes are always stationary. However, they must be '''invertible''', meaning the roots of <math>\theta(B) = 0</math> lie strictly outside the unit circle. Invertibility ensures the process can be represented as an infinite-order AR model.
 +
'''ACF/PACF Patterns:''' The ACF of an MA(<math>q</math>) process cuts off abruptly after lag <math>q</math>, while the PACF decays gradually (tails off).
  
RCODE:
+
====ARIMA Models====
par(mfrow=c(2,1))
+
The Autoregressive Integrated Moving Average model, ARIMA(<math>p, d, q</math>), combines AR and MA models after differencing the data <math>d</math> times to achieve stationarity:
barplot(ARMAacf(c(0.9),lag.max=30),main='acf x(t)=0.9x(t-1)+w(t)')
+
<math>\phi(B)(1-B)^d x_t = \theta(B)w_t</math>
barplot(ARMAacf(c(1,-0.9),lag.max=30),main='acf x(t)=x(t-1)-0.9x(t-2)+w(t)')
+
If seasonality is present, we extend this to the Seasonal ARIMA (SARIMA) model, which incorporates seasonal AR, MA, and differencing terms.
  
 +
In practice, R's ''forecast'' package fits and reports ARIMA models with seasonality, e.g.,
  
[[Image:SMHA_Fig5_TimeSeries_Analysis.png|500px]]
+
<pre>
 +
ARIMA(p,d,q)(P,D,Q)[m]
 +
</pre>
  
*Recognizing patterns in probabilistic data
+
These six parameters define how the model handles '''trends''', '''cycles''', and '''shocks''' in the data.
**Example 1: compare White noise & Autoregressive process $x_{t}=w_{t} \& x_{t}=x_{t-1}-0.9x_{t-2}+w_{t})$
 
  
RCODE:
+
For instance, this model ''ARIMA(2,1,1)(0,1,0)[12]'' has two parts, the '''Non-Seasonal''' component (first 3 parameters), and the '''Seasonal''' component, (the last 3 parameters) .
w1 <- rnorm(200) ## generate 200 data from the standard normal distribution
 
x <- filter(w1, filter=c(1,-0.9),method='recursive')[-(1:50)]
 
w2 <- w1[-(1:50)]
 
xt <- ts(x)
 
par(mfrow=c(2,2))
 
plot.ts(w,main='white noise')
 
acf(w2,lag.max=25)
 
plot.ts(xt,main='autoregression')
 
acf(xt,lag.max=25)
 
  
[[Image:SMHS_Fig6_TimeSeries_Analysis.png|500px]]
+
* Non-Seasonal Component: ''(p, d, q)'': These parameters handle the short-term relationships between consecutive observations.
 +
** '''p''' (Autoregressive - AR): The number of lag observations included in the model. A value of 2 means the model uses the two previous time points to predict the next one.
 +
** '''d''' (Degree of Differencing): The number of times the raw observations are differenced to make the data stationary (removing trends). A value of 1 means the model is looking at the change between points rather than the absolute values.
 +
** '''q''' (Moving Average - MA): The size of the moving average window applied to forecast errors. A value of 1 means the model accounts for the error made in the previous prediction.
  
*Apparently, there is almost no autocorrelation in white noise, while there is large autocorrelation in second series.
+
* Seasonal Component: ''(P, D, Q)'': These parameters handle repeating patterns (like the AirPassengers monthly spikes).
 +
** P (Seasonal Autoregressive): Similar to ''p'', but specifically for the seasonal lags. For monthly data, P=1 would mean using the value from the same month last year to predict this month, in the output, this is 0.
 +
** D (Seasonal Differencing): The number of seasonal differences. A value of 1 means the model subtracted last year's value from this year's value to remove seasonal trends.
 +
** Q (Seasonal Moving Average): Similar to ''q'', but for seasonal forecast errors, in the output, this is 0.
 +
** [m] (Seasonal Period): The number of periods in each season. For the AirPassengers dataset, this is 12 because the data is monthly and repeats every year.
  
Example 2: AR & MA $x_{t}=x_{t-1}-0.9x_{t-2}+w_{t}\& v_{t}=\frac{1}{3}\left(w_{t-1}+w_{t}+w_{t+1} \right))$
+
For a specific model: '''ARIMA(2,1,1)(0,1,0)[12]'''
  
RCODE:
+
{| class="wikitable"
v <- filter(w1, sides=2, rep(1/3, 3))
+
! Parameter !! Value !! Meaning
vt <- ts(v)[2:199]
+
|-
par(mfrow=c(2,2))
+
| p || 2 || Uses 2 lags for the AR part.
plot.ts(xt,main='white noise')
+
|-
acf(xt,lag.max=25)
+
| d || 1 || Applied 1st order differencing to remove the general trend.
plot.ts(vt,main='autoregression')
+
|-
acf(vt,lag.max=25)
+
| q || 1 || Uses 1 lag for the Moving Average error correction.
 +
|-
 +
| P || 0 || No seasonal autoregressive terms.
 +
|-
 +
| D || 1 || Applied seasonal differencing (Value - Value same month last year).
 +
|-
 +
| Q || 0 || No seasonal moving average terms.
 +
|-
 +
| m || 12 || Data follows a 12-month annual cycle.
 +
|}
  
[[Image:SMHS_Fig7_TimeSeries_Analysis.png|500px]]
+
* '''Control Parameters''': The Stepwise and Approximation Flags, stepwise=FALSE and approximation=FALSE, force ''auto.arima'' to search more thoroughly through all possible combinations of these 6 parameters rather than using a shortcut search. This results in a better fit but takes longer to compute.
  
 +
===Methods and Data Analysis===
  
*ACF, PACF and some examples of data in R
+
====Exploratory Data Analysis (EDA) and Diagnostics====
 +
Before modeling, we must visualize the data, check for outliers, and assess distributional assumptions. A standard approach involves time series plots, Q-Q plots (to check normality), and examining residuals.
  
RCODE:
+
If we fit a simple mean model <math>x_t = \mu + w_t</math>, the residuals <math>e_t = x_t - \hat{\mu}</math> should resemble White Noise. We diagnose this by checking:
data<-
+
1. Zero mean of residuals.
c(28,22,36,26,28,28,26,24,32,30,27,24,33,21,36,32,31,25,24,25,28,36,27,32,34,30,
+
2. Constant variance across time (homoscedasticity).
25,26,26,25,-
+
3. No autocorrelation (checked via ACF plots of residuals).
44,23,21,30,33,29,27,29,28,22,26,27,16,31,29,36,32,28,40,19,37,23,32,29,-
+
4. Normality of residuals (checked via Q-Q plots).
2,24,25,27,24,16,29,20,28,27,39,23)
 
par(mfrow=c(1,2))
 
plot(data)
 
qqnorm(data)
 
qqline(data)
 
  
[[Image:SMHS_Fig8_TimeSeries_Analysis.png|500px]]
+
====Smoothing Techniques====
 +
Smoothing helps identify underlying structures by averaging out the noise.
  
data1<-data[c(-31,-55)]
+
=====Moving Averages=====
par(mfrow=c(1,2))
+
A simple centered moving average smooths <math>x_t</math> using a window of size <math>k</math>:
plot.ts(data1)
+
<math>v_t = \frac{1}{k} \sum_{j=-(k-1)/2}^{(k-1)/2} x_{t+j}</math>
qqnorm(data1)
 
qqline(data1)
 
  
[[Image:SMHS_Fig9_TimeSeries_Analysis.png|500px]]
+
<pre>
 +
# Simulate White Noise and apply Moving Averages
 +
set.seed(123)
 +
w <- ts(rnorm(150))  # generate 150 data points from standard normal distribution
 +
ma3 <- filter(w, sides=2, rep(1/3, 3))  # 3-period centered moving average
 +
ma9 <- filter(w, sides=2, rep(1/9, 9))  # 9-period centered moving average
  
library(astsa)
+
plot.ts(w, main="White Noise vs. Moving Averages", ylab="Value")
acf2(data1)
+
lines(ma3, col="red", lwd=2)
 +
lines(ma9, col="blue", lwd=2)
 +
legend("topright", legend=c("WN", "MA(3)", "MA(9)"), col=c("black", "red", "blue"), lwd=2)
 +
</pre>
 +
*Note: A larger window (lag) results in a smoother time series plot, but introduces more lag in detecting changes.*
  
*Examining the data:  
+
=====Exponential Smoothing=====
Diagnostics of outliers
+
Unlike moving averages, exponential smoothing assigns exponentially decreasing weights as observations get older.
 +
* '''Simple Exponential Smoothing (SES):''' Appropriate for data with no trend or seasonality. The forecast is <math>\hat{x}_{t+1} = \alpha x_t + (1-\alpha)\hat{x}_t</math>, where <math>0 < \alpha < 1</math> is the smoothing parameter.
 +
* '''Holt-Winters Method:''' Extends SES to capture trends (additive or multiplicative) and seasonality.
  
RCODE:
+
<pre>
dat<-  
+
# Using built-in AirPassengers dataset
c(28,22,36,26,28,28,26,24,32,30,27,24,33,21,36,32,31,25,24,25,28,36,27,32,34,30,
+
data(AirPassengers)
25,26,26,25,-44,23,21,30,    33,29,  27,29,28,22,26,27,16,31,29,36,32,28,40,19,
+
plot(AirPassengers, main="AirPassengers Dataset", ylab="Passengers")
37,23,32,29,-2,24,25,27,24,16,29,20,28,27,  39,23)
 
plot.ts(dat)
 
  
 +
# Holt-Winters Additive Model
 +
hw_add <- HoltWinters(AirPassengers, seasonal="additive")
 +
plot(hw_add, main="Holt-Winters Additive Filtering")
 +
</pre>
  
[[Image:SMHS_Fig10_TimeSeries_Analysis.png|500px]]
+
=====Nonparametric Smoothing=====
 +
* '''Loess (Local Polynomial Regression):''' Fits simple models to localized subsets of data. Controlled by the `span` parameter. A larger span yields a smoother curve.
 +
* '''Splines: Piecewise polynomials joined smoothly at "knots." Controlled by degrees of freedom or a smoothing parameter (`spar`).
  
 +
<pre>
 +
# Loess and Spline Smoothing on a time series
 +
t <- time(AirPassengers)
 +
fit_loess <- lowess(t, AirPassengers, f=0.3) # f is the span
 +
fit_spline <- smooth.spline(t, AirPassengers, spar=0.8)
  
Then we can easily tell that there is an outlier in this data series.
+
plot(AirPassengers, main="Nonparametric Smoothing", ylab="Passengers")
+
lines(fit_loess, col="red", lwd=2)
# what if we use a boxplot here?
+
lines(fit_spline, col="blue", lwd=2, lty=2)
boxplot(dat)
+
legend("topleft", legend=c("Loess", "Spline"), col=c("red", "blue"), lwd=2, lty=c(1,2))
 +
</pre>
  
[[Image:SMHS_Fig11_TimeSeries_Analysis.png|500px]]
+
====Modeling Seasonality and Trend====
  
identify(rep(1,length(dat)),dat)  ## identify outliers by clicking on screen
+
=====Classical Decomposition=====
## we identified the outliers are 55 and 31:
+
We can decompose a time series into its trend, seasonal, and irregular components using moving averages or Local Regression (LOESS).
dat[31]
 
[1] -44
 
dat[55]
 
[1] -2
 
datr <- dat[c(-31,-55)]  ## this is the dat with outlier removed
 
qqnorm(datr)
 
qqline(datr)
 
  
 +
<pre>
 +
# Multiplicative decomposition using LOESS
 +
decomp <- stl(AirPassengers, s.window="periodic")
 +
plot(decomp, main="Classical Decomposition via STL")
 +
</pre>
  
[[Image:SMHS_Fig12_TimeSeries.png|500px]]
+
=====Regression with Dummy Variables=====
 +
Seasonality can be explicitly modeled using regression with dummy variables. For monthly data with an intercept, we use 11 dummy variables to avoid the dummy variable trap.
  
 +
<pre>
 +
# Regression with seasonal dummy variables
 +
y <- log(AirPassengers) # Log transform to stabilize variance
 +
t <- time(y)
 +
Q <- factor(cycle(y))  # Creates monthly factors (1-12)
  
## this plot shows that the distribution of the new data series (right)
+
# Fit linear regression with trend and seasonality
is closer to normal compared to the original data (left).
+
reg_model <- lm(y ~ t + Q)
## The effect of outliers by comparing the ACF and PACF of the data
+
summary(reg_model)
series:
 
library(astsa)
 
acf2(dat)
 
acf2(datr)
 
[[Image:SMHS_Fig13_TimeSeries.png|800px]]
 
  
*Diagnostics of the error: consider the model $(x_{t}=\mu+w_{t} )$ check to see if the mean of the probability distribution of error is 0; whether variance of error is constant across time; whether errors are independent and uncorrelated; whether the probability distribution of error is normal.
+
# Diagnostics of the regression model
:::Residual plot for functional form
+
par(mfrow=c(2,2))
 +
plot(reg_model, main="Regression Diagnostics")
 +
</pre>
  
[[Image:SMHS_Fig14_TimeSeries.png|500px]]
+
=====Modern ARIMA Modeling (Box-Jenkins Approach)=====
 +
The modern standard for time series forecasting involves the Box-Jenkins iterative cycle: Identification, Estimation, and Diagnostic Checking.
  
:::Residual plot for equal variance
+
1. Identification: Use ACF and PACF plots, and unit root tests (like Augmented Dickey-Fuller) to determine <math>d</math>. Use algorithms to suggest <math>p</math> and <math>q</math>.
  
[[Image:SMHS_Fig15_TimeSeries.png|500px]]
+
2. Estimation: Maximize the likelihood function to estimate AR and MA coefficients.
  
:::Check if residuals are correlated by plotting $e_{t} vs. e_{t-1}$: there are three possible cases – positive autocorrelation, negative autocorrelation and no autocorrelation.  
+
3. Diagnostic Checking: Ensure residuals are White Noise.
  
*Smoothing: to identify the structure by averaging out the noise since series has some structure plus variation.
+
The forecast package in R automates this efficiently:
::Moving averages:
 
ma3<-filter(datr,sides=2,rep(1/3,3))
 
ma9<-filter(datr,sides=2,rep(1/9,9))
 
plot.ts(datr,ylim=c(15,50),ylab='data')
 
par(new=T)
 
plot.ts(ma3,ylim=c(15,50),col=2,ylab='data')  ## red line indicates ma3
 
par(new=T)
 
plot.ts(ma9,ylim=c(15,50),col=3,ylab='data')  ## green line indicates ma9
 
  
[[Image:SMHS_Fig16_TimeSeries.png|500px]]
+
<pre>
 +
# Install and load forecast package if necessary
 +
# install.packages("forecast")
 +
library(forecast)
  
*Apparently, the bigger the laggings in moving average, the smoother the time series plot of the data series.
+
# dev.off()  # reset display, if necessary
ma3<-filter(datr,sides=2,rep(1/3,3))
 
ma31<-filter(datr,sides=1,rep(1/3,3))
 
plot.ts(datr,ylim=c(15,50),ylab='data')
 
par(new=T)
 
plot.ts(ma3,ylim=c(15,50),col=2,ylab='data')  ## red line indicates ma3
 
par(new=T)
 
plot.ts(ma31,ylim=c(15,50),col=3,ylab='data')  ## green line indicates ma31
 
  
[[Image:SMHS_Fig17_TimeSeries.png|500px]]
+
# auto.arima automatically identifies the best SARIMA model
 +
fit <- auto.arima(AirPassengers, seasonal=TRUE, stepwise=FALSE, approximation=FALSE)
 +
summary(fit)
  
Two sided and one sided filters with same coefficients seem to be similar in their smoothing effect.
+
# Check residuals to ensure they are White Noise
 +
checkresiduals(fit)
  
## the effect of equal vs. unequal weights?
+
# Forecast the next 24 months
ma31<-filter(datr,sides=1,rep(1/3,3))
+
fc <- forecast(fit, h=24)
mawt<-filter(datr,sides=1,c(1/9,1/9,7/9))
+
plot(fc, main="24-Month Forecast using SARIMA")
plot.ts(datr,ylim=c(15,50),ylab='data')
+
</pre>
par(new=T)
 
plot.ts(ma31,ylim=c(15,50),col=2,ylab='data')  ## red line indicates ma31
 
par(new=T)
 
plot.ts(mawt,ylim=c(15,50),col=3,ylab='data')  ## green line indicates mawt
 
  
[[Image:SMHS_Fig18_TimeSeries.png|500px]]
+
Next, let's try to fit '''manually''' another (specific) '''ARIMA(1,1,5)''' model where the (p,d,q) parameters are hard-coded. Then, compare ''fitManual'' to the optimized model (''fit <- auto.arima()''), by contrasting the corresponding ''AIC'', ''BIC'', and log-likelihood estimates of each model, smaller values indicate better model and higher model fidelity.  
  
The smoothing effect of filter with equal weights seems to be bigger than the filter with unequal weights.
+
Also, compare the 24-month forward forecasts of ''fitManual'' vs. the optimized model ''fit''.
  
*Simple Exponential Smoothing (SES): premise – the most recent observations might have the highest predictive value $(\hat{x}_{t} = \alpha x_{t-1} + (1- \alpha) \hat{x}_{t-1} )$, that is new forecast = $\alpha \ actual value + (1-\alpha) \ previous forecast$. This method is appropriate for series with no trend or seasonality.
+
<pre>
 +
fitManual <- arima(AirPassengers, order=c(1,1,5)); summary(fitManual)
  
*For series with trend, can use Holt-Winters Exponential smoothing additive model $\hat{x}_{t+h}=a_{t}+h*b_{t}.$ We smooth the level or permanent component, the trend component and the seasonal component.
+
fcManual <- forecast(fitManual, h=24)
 +
plot(fcManual, main="(Manual) 24-Month Forecast using SARIMA")
  
RCODE:
+
fcManual$model$aic # [1] 1353.116
HoltWinters(series,beta=FALSE,gamma=FALSE) #simple exponential smoothing
+
fcManual$model$loglik # [1] -669.5579
HoltWinters(series,gamma=FALSE) # HoltWinters trend
+
</pre>
  HoltWinters(series,seasonal= 'additive') # HoltWinters additive trend+seasonal
 
HoltWinters(series,seasonal= 'multiplicative') # HoltWinters multiplicative trend+seasonal
 
  
 +
====Outlier Detection====
 +
Outliers can severely distort time series models (inflating variance, skewing ACF). They can be detected visually via time plots and boxplots, or systematically using residuals.
  
 +
<pre>
 +
# Creating a dataset with injected outliers
 +
set.seed(42)
 +
dat <- ts(c(28,22,36,26,28,28,26,24,32,30,27,24,33,21,36,32,31,25,24,25,28,36,27,32,34,30,
 +
            25,26,26,25,-44,23,21,30,33,29,27,29,28,22,26,27,16,31,29,36,32,28,40,19,
 +
            37,23,32,29,-2,24,25,27,24,16,29,20,28,27,39,23))
  
Example: RCODE: (effect of different values of alpha on datr SES)
+
par(mfrow=c(1,2))
par(mfrow=c(3,1))
+
plot.ts(dat, main="Time Series with Outliers")
exp05=HoltWinters(datr,alpha=0.05,beta=FALSE,gamma=FALSE) #simple exponential smoothing
+
boxplot(dat, main="Boxplot Identifying Outliers")
p05=predict(exp05,20)  
 
plot(exp05,p05, main = 'Holt- Winters filtering, alpha=0.05')
 
exp50=HoltWinters(datr, alpha=0.5,beta=FALSE,gamma=FALSE)  #simple exponential smoothing
 
p50=predict(exp50,20)
 
plot(exp50,p50, main = 'Holt-Winters filtering, alpha=0.5')
 
exp95=HoltWinters(datr, alpha=0.95,beta=FALSE,gamma= FALSE)  #simple exponential smoothing
 
p95=predict(exp95,20)
 
plot(exp95,p95,main = "Holt-Winters filtering, alpha=0.95")
 
  
 +
# Identifying and removing outliers based on extreme boxplot bounds
 +
outlier_bounds <- boxplot.stats(dat)$out
 +
dat_clean <- dat[!(dat %in% outlier_bounds)]
  
[[Image:SMHS_Fig19_TimeSeries.png|500px]]
+
# Re-plot clean data
 +
par(mfrow=c(1,1))
 +
plot.ts(dat_clean, main="Cleaned Time Series", col="blue", lwd=2)
 +
</pre>
  
*Comparing two models: we can compare ‘fit’ and forecast ability.
+
==Applications==
  
RCODE:
+
1. '''Epidemiology and Public Health:''' Time series models are essential for tracking disease outbreaks (e.g., COVID-19, influenza). SARIMA models help forecast case counts, accounting for weekly seasonal cycles (e.g., higher reporting on weekdays) and long-term trends, thereby informing hospital resource allocation.
obs <- exp50$\$$x[-1]
+
2. '''Health Economics:''' Forecasting hospital readmission rates, healthcare costs, and resource utilization over time using ARIMA and Exponential Smoothing to optimize budget planning.
fit <- exp50$\$$fitted[,1]
+
3. '''Clinical Monitoring:''' Analyzing ECG readings or continuous glucose monitoring data. Signal processing techniques combined with MA filtering help isolate true physiological signals from high-frequency noise.
r1 <- obs-fit
+
4. '''Macroeconomics:''' As investigated by Nelson and Plosser (1982), understanding whether macroeconomic time series (like GDP or stock prices) are stationary around a deterministic trend or non-stationary stochastic processes fundamentally changes how we interpret economic shocks and formulate policy.
plot(r1)
 
par(mfrow=c(1,2))
 
plot(r1)
 
acf(r1)
 
  
[[Image:SMHS_Fig20_TimeSeries.png|500px]]
+
==Software==
  
*Regression based (nonparametric) smoothing: (1) kernel: extension of a two-sided MA. Use a bandwidth (number of terms) plus a kernel function to smooth a set of values, ''ksmooth''; (2) local linear regression (loess): fit local polynomial regression and join them together, ''loess''; (3) fitting cubic splines: extension of polynomial regression, partition data and fit separate piecewise regression to each section, smooth them together where they join, ''smooth.spine''; (4) many more, ''lowess'', ''supsmu''.
+
Modern time series analysis in R relies on a few key packages:
::•Kernel smoother: define bandwidth; choose a kernel; use robust regression to get smooth estimate; bigger the bandwidth, smoother the result
+
* '''stats''' (Base R): Contains foundational functions like `ts()`, `acf()`, `pacf()`, `arima()`, `HoltWinters()`, and `stl()`.
::•Loess: define the window width; choose a weight function; use polynomial regression and weighted least squares to get smoothest estimate; the bigger span, the smoother the result.
+
* '''forecast''' (now transitioning to '''fable'''): Provides `auto.arima()`, `checkresiduals()`, `forecast()`, and modern tidy time series frameworks (`tsibble`).
::•Splines: divide data into intervals; in each interval, fit a polynomial regression of order p such that the splines are continuous at knots; use a parameter (spar in R) to smooth the splines; the bigger the spar, the smoother the result.
+
* '''astsa''': Applied Statistical Time Series Analysis, highly recommended for its companion datasets and straightforward functions like `acf2()` (combines ACF and PACF in one plot).
::•Issues with regression based (nonparametric) smoothing: (1) the size of S has an important effect on the curve; (2) a span that is too small (meaning that insufficient data fall within the window) produces a curve characterized by a lot of noise, results in a large variance; (3) if the span is too large, the regression will be oversmoothed, and thus the local polynomial may not fit the data well. This may result in loss of important information, and thus the fit will have large bias; (4) we want the smallest span that provides a smooth structure.
+
* '''tseries''': Provides unit root tests like `adf.test()` to check for stationarity.
  
*A regression model for trend: given $y_{t}, t=1,2,…,\cdots,$ T is a time series, fit a regression model for trend. The plot suggests that we fit a trend line, say $y_{t}=\beta_{1}+\beta_{2} t + \varepsilon_{t}$.
+
==Problems==
::•Assumptions: variance of error is constant across observations; errors are independent / uncorrelated across observations; the regressor variables and error are independent; we may also assume probability distribution of error is normal.
 
 
 
RCODE
 
fit <- lm(x~z)
 
summary(fit$\$$coeff)
 
plot.ts(fit$\$$resid)  # plot the residuals of the fitted regression model
 
acf2(fit$\$$resid)  # the ACF and PACF of the residuals of the fitted regression model
 
qqnorm(fit$\$$resid)
 
qqline(fit$\$$resid)
 
 
 
 
*Analyzing patterns in data and modeling: seasonality
 
**Exploratory analysis
 
 
 
RCODE:
 
data=c( 118,132,129,121,135,148,148,136,119,104,118,115,126,141,135,125,
 
149,170,170,158,133,114,140,145,150,178,163,172,178,199,199,184,162,146,
 
166,171,180,193,181,183,218,230,242,209,191,172,194,196,196,236,235,
 
229,243,264,272,237,211,180,201,204,188,235,227,234,264,302,293,259,229,
 
203,229,242,233,267,269,270,315,364,347,312,274,237,278,284,277,317,313,
 
318,374,413,405,255,306,271,306,300)
 
air <- ts(data,start=c(1949,1),frequency=12)
 
y <- log(air)
 
par(mfrow=c(1,2))
 
plot(air)
 
plot(y)
 
 
 
 
 
[[Image:SMHS_Fig21_TimeSeries.png|500px]]
 
 
 
 
 
## from the charts above, taking log helps a little bit with the increasing variability.
 
 
 
monthplot(y)
 
 
 
 
 
[[Image:SMHS_Fig22_TimeSeries.png|500px]]
 
 
 
 
 
acf2(y)
 
 
 
[[Image:SMHS_Fig23_TimeSeries.png|500px]]
 
 
 
*Smoothing: the Holt-Winters method smooths and obtains three parts of a series: level, trend and seasonal. The method requires smoothing constants: your choice or those which minimize the one step ahead forecast error: $\sum_{t=1}^{T} \left(\hat{x}_{t} - x_{t-1} \right )^{2}$, where $t=1,2,\cdots,T$. So at a given time point, smoothed value = (level+h*trend)+seasonal index (additive); data value = (level+h*trend)*seasonal index (multiplicative). Additive: $\hat{x}_{t+h}=a_{t}+h*b_{t}+s_{t+1+(h-1)\mod{p}}$; Multiplicative: $\hat{x}_{t+h}=(a_{t}+h*b_{t})* s_{t+1+(h-1)\mod{p}}.$
 
 
 
 
 
RCODE:
 
par(mfrow=c(2,1))
 
sa <- HoltWinters(air,seasonal=c('additive'))
 
psa <- predict(sa,60)
 
plot(sa,psa,main='Holt-Winters filtering, airline passengers, HW additive')
 
sm <- HoltWinters(air,seasonal=c('multiplicative'))
 
psam <- predict(sm,60)
 
plot(sm,psam,main='Holt-Winters filtering, airline passengers, HW multiplicative')
 
 
 
[[Image:SMHS_Fig24_TimeSeries.png|500px]]
 
 
 
## seasonal decomposition using loess, log(airpassengers)
 
 
 
[[Image:SMHS_Fig25_TimeSeries.png|500px]]
 
 
 
*Regression: estimation, diagnostics, interpretation. Defining dummy variables to capture the seasonal effect in regression. If there is a variable with k categories, there will be k-1 dummies if the intercept is included in the regression (exclude one group); there will be k dummies if the intercept is not included in the regression; if the intercept is included in the regression, the coefficient of a dummy variable is always the difference of means between that group and the excluded group.
 
  
RCODE:
+
1) Consider a signal-plus-noise model of the general form <math>x_{t}=s_{t}+w_{t}</math>, where <math>w_{t}</math> is Gaussian white noise with <math>\sigma^{2}_{w} = 1</math>. Simulate and plot <math>n=200</math> observations from each of the following two models.
t <- time(y)
 
Q <- factor(rep(1:12,8))
 
h <- lm(y~time(y)+Q)  # regression with intercept, drops first category = January
 
model.matrix(h)
 
h1 <- lm(y~0+time(y)+Q)  # regression without intercept
 
model.matrix(h1)
 
x1 <- ts(y[1:72],start=c(1949,1),frequency=12)
 
t1 <- t[1:72]
 
Q1 <- Q[1:72]
 
contrasts(Q1)=contr.treatment(12,base=10)
 
h2 <- lm(x1~t1+Q1)  # regression with intercept, drops 10th category = October
 
model.matrix(h2)
 
reg <- lm(x1~t1+Q1)
 
par(mfrow=c(2,2))
 
plot.ts(reg$\$$resid)
 
qqnorm(reg$\$$resid)
 
qqline(reg$\$$resid)
 
acf(reg$\$$resid)
 
pacf(reg$\$$resid)
 
  
[[Image:SMHS_Fig26_TimeSeries.png|500px]]
+
(a) <math>x_{t} = s_{t}+w_{t}</math>, for <math>t=1,2,\dots,200</math>, where <math>s_{t}=\begin{cases} 0 & \text{ if } t=1,2,\dots,100 \\ 10\exp \left\{ -(t-100)/20 \right\} \cos\left( 2\pi t/4 \right) & \text{ if } t=101,102,\dots,200 \end{cases}</math>
  
## forecasts RCODE
+
(b) <math>x_{t}=s_{t}+w_{t}</math>, for <math>t=1,2,\dots,200</math>, where <math>s_{t}=\begin{cases} 0 & \text{ if } t=1,2,\dots,100 \\ 10\exp \left\{ -(t-100)/200 \right\} \cos\left( 2\pi t/4 \right) & \text{ if } t=101,102,\dots,200 \end{cases}</math>
newdata=data.frame(t1=t[73:96], Q1=Q[73:96])
 
preg=predict(reg,newdata,se.fit=TRUE)
 
seup=preg$\$$fit+2*preg$\$$se.fit
 
seup
 
        1       2       3        4        5        6        7        8        9
 
5.577144 5.722867 5.679936 5.667979 5.780376 5.881420 5.889829 5.782526 5.655445
 
      10       11      12      13      14      15      16      17      18
 
5.525933 5.661158 5.681042 5.716577 5.862300 5.819369 5.807412 5.919809 6.020853
 
      19      20      21      22      23      24
 
6.029262 5.921959 5.794878 5.665366 5.800591 5.820475
 
selow=preg$\$$fit-2*preg$\$$se.fit
 
selow
 
        1        2        3        4       5        6        7        8        9
 
5.479442 5.625165 5.582234 5.570277 5.682674 5.783718 5.792127 5.684824 5.557743
 
      10      11      12      13      14      15      16      17      18
 
5.428232 5.563456 5.583340 5.610927 5.756650 5.713720 5.701763 5.814160 5.915203
 
      19      20      21      22      23      24
 
5.923613 5.816309 5.689228 5.559717 5.694941 5.714825
 
  
x1r=y[73:96]
+
(c) Compare the signal modulators <math>(a) \exp \left\{-t /20 \right\}</math> and <math>(b) \exp \left\{-t/200 \right\}</math>, for <math>t=1,2,\dots,100</math>. What is the primary difference in the persistence of the signal?
t1r=t[73:96]
 
plot(t1r,x1r, col='red',lwd=1:2,main='Actual vs Forecast', type='l')
 
lines(t1r,preg$\$$fit, col='green', lwd=1:2)
 
lines(t1r,seup, col='blue', lwd=1:2)
 
lines(t1r,selow, col='blue', lwd=1:2)
 
## red refer actual; green refers to forecast; blue refers to standard error
 
 
 
[[Image:SMHS_Fig27_TimeSeries.png|500px]]
 
 
 
## Holt Winters smoothing: forecasts
 
sa=HoltWinters(x1, seasonal = c("additive"))
 
psa=predict(sa,24)
 
plot(t1r,x1r, col="red",type="l",main = "Holt- Winters filtering, airline passengers, HW additive")
 
lines(t1r, psa, ,col="green")
 
 
 
## red represents the actual, green: forecast
 
 
 
[[Image:SMHS_Fig28_TimeSeries.png|500px]]
 
 
 
sqrt(sum(x1r-psa)^2)) ##HW
 
[1] 0.3988259
 
sqrt(sum((x1r-preg$\$$fit)^2)) ##reg
 
[1] 0.3807665
 
 
 
==Applications==
 
 
 
1)[http://www.statsoft.com/Textbook/Time-Series-Analysis  This article] presents a comprehensive introduction to the filed of time series analysis. It discussed about the common patterns existing in time series data and introduced some commonly used techniques to deal with time series data and ways to analyze time series.
 
 
 
2)[http://www.sciencedirect.com/science/article/pii/0304393282900125  This article] investigates whether macroeconomic time series are better characterized as stationary fluctuations around a deterministic trend or as non-stationary processes that have no tendency to return to a deterministic path. Using long historical time series for the U.S. we are unable to reject the hypothesis that these series are non-stationary stochastic processes with no tendency to return to a trend line. Based on these findings and an unobserved components model for output that decomposes fluctuations into a secular or growth component and a cyclical component we infer that shocks to the former, which we associate with real disturbances, contribute substantially to the variation in observed output. We conclude that macroeconomic models that focus on monetary disturbances as a source of purely transitory fluctuations may never be successful in explaining a large fraction of output variation and that stochastic variation due to real factors is an essential element of any model of macroeconomic fluctuations.
 
 
 
 
 
==Software==
 
See the CODE examples given in this lecture.
 
 
 
==Problems==
 
  
1) Consider a signal-plus-noise model of the general form $x_{t}=s_{t}+w_{t}$, where $w_{t}$ is Gaussian white noise with $\sigma ^{2} _{w} = 1$. Simulate and plot n=200 observations from each of the following two models.
+
2) (a) Generate <math>n=100</math> observations from the autoregression <math>x_{t}=-0.9 x_{t-2} + w_{t}</math>, with <math>\sigma_w=1</math>. Next, apply the moving average filter <math>v_{t}=\left(x_{t}+x_{t-1}+x_{t-2}+x_{t-3} \right)/4</math> to the data you generated. Now, plot <math>x_{t}</math> as a line and superimpose <math>v_{t}</math> as a dashed line. Comment on the behavior of <math>x_{t}</math> and how applying the moving average filter changes that behavior. [Hint: use <code>v <- filter(x, rep(1/4,4), sides=1)</code> for the filter].
  
(a) $x_{t} = s_{t}+w_{t},$ for $t=1,2,\cdots,200,$ where $s_{t}=\begin{cases} 0 & \text{ if } t=1,2,\cdots,100 \\  & 10\,exp \left \{ -(t-100)/20) \right \} cos\left ( 2\pi t/4 \right )\text{ if } t=101,102,\cdots,200 \end{cases}.$
+
(b) Repeat (a) but with <math>x_{t}=\cos \left( 2 \pi t/4 \right)</math>.
  
(b) $x_{t}=s_{t}+w_{t}, for t=1,2,\cdots, 200,$ where $s_{t}=\begin{cases} 0 & \text{ if } t=1,2,\cdots,100 \\  & 10\,exp \left \{ -(t-100)/200) \right \} cos\left ( 2\pi t/4 \right )\text{ if } t=101,102,\cdots,200 \end{cases}.$
+
(c) Repeat (b) but with added <math>N(0,1)</math> noise, <math>x_{t}=\cos \left(2 \pi t/4 \right) + w_{t}</math>.
  
(c) Compare the signal modulators $(a) exp \left \{-t/20 \right \} and (b) exp \left \{-t/200 \right \}, for t=1,2, cdots,100.$
+
(d) Compare and contrast the smoothing effects observed in <math>(a) - (c)</math>.
  
 +
3) For the two series, <math>x_{t}</math> in Problem 1 (a) and (b):
  
2) (a) Generate $n=100$ observations from the autoregression $x_{t}=-0.9 x_{t-2} +w_{t}, with \sigma {w}=1,$ using the method described in Example 1.10, page 13 given in the textbook. Next, apply the moving average filter $v_{t}=\left(x_{t}+x_{t-1}+x_{t-2}+x_{t-3} \right )/4 to x_{t},$ to the data you generated. Now, plot $x_{t}$ as a line and superimpose $v_{t}$ as a dashed line. Comment on the behavior of $x_{t}$ and how to apply the moving average filter changes that behavior. [Hint: use $v=filter(x,rep(1/4,4),sides=1)$ for the filter].
+
(a) Compute and plot the mean functions <math>\mu_{x}(t)</math> for <math>t=1,2,\dots,200</math>.
  
(b) Repeat (a) but with $x_{t}=cos \left( 2 \pi t/4 \right).$
+
(b) Explain why these models are not stationary.
  
(c) Repeat (b) but with added $N(0,1)$ noise, $x_{t}=cos \left(2 \pi t/4 \right) +w_{t}.$
+
4) Consider the time series <math>x_{t} = \beta_{1} + \beta_{2} t + w_{t}</math>, where <math>\beta_{1}</math> and <math>\beta_{2}</math> are known constants and <math>w_{t}</math> is a white noise process with variance <math>\sigma^{2}_{w}</math>.
  
(d) Compare and contrast $(a) – (c).$
+
(a) Determine whether <math>x_{t}</math> is stationary. Justify your answer mathematically.
  
3) For the two series, $x_{t}$ in 6.1 (a) and (b):
+
(b) Show that the process <math>y_{t}=x_{t} - x_{t-1}</math> is stationary.
  
(a) Compute and plot the mean functions $\mu _{x} (t) for t=1,2,\cdots,200.$
+
(c) Show that the mean of the moving average <math>v_{t}= \frac{1}{2q+1} \sum_{j=-q}^{q} x_{t-j}</math> is <math>\beta_{1} + \beta_{2} t</math>, and provide a simplified expression for its autocovariance function.
  
(b) Calculate the autocovariance functions, $\gamma _{x} (s,t), for s,t=1,2, \cdots, 200.$
+
5) A time series with a periodic component can be constructed from <math>x_{t} = U_{1} \sin(2 \pi \omega_{0} t) + U_{2} \cos(2 \pi \omega_{0} t)</math>, where <math>U_{1}</math> and <math>U_{2}</math> are independent random variables with zero means and <math>E(U^{2}_{1}) = E(U^{2}_{2}) = \sigma^{2}</math>. The constant <math>\omega_{0}</math> determines the period. Show that this series is weakly stationary with autocovariance function <math>\gamma(h) = \sigma^{2} \cos(2 \pi \omega_{0} h)</math>. [Hint: you will need to refer to standard trigonometric identities for products of sine and cosine functions].
  
4) Consider the time series $x_{t} = \beta {1} + \beta {2} t + w{t},$ where $\beta _{1}$ and $\beta_{2}$ are known constants and $w_{t}$ is a white noise process with variance $\sigma ^{2} _{w}.$
+
6) Suppose we would like to predict a single stationary series <math>x_{t}</math> with zero mean and autocorrelation function <math>\gamma(h)</math> at some time in the future, say <math>t+l</math>, for <math>l>0</math>.
  
(a) Determine whether $x_{t}$ is stationary.
+
(a) If we predict using only <math>x_{t}</math> and some scale multiple <math>A</math>, show that the mean-square prediction error <math>MSE(A)=E \left[ (x_{t+l}-A x_{t})^{2} \right]</math> is minimized by the value <math>A = \rho(l)</math>.
 
(b) Show that the process $y_{t}=x_{t} – x_{t-1}$ is stationary.
 
  
(c) Show that the mean of the moving average $v_{t}= \frac {1}{2q+1} \sum_{j=-q}^{q} x_{t-j},$ is $\beta _{1} + \beta _{2} t,$ and give a simplified expression for the autocovariance function.  
+
(b) Show that the minimum mean-square prediction error is <math>MSE(A)= \gamma(0) \left[ 1- \rho^{2}(l) \right]</math>.
  
 +
(c) Show that if <math>x_{t+l} = Ax_{t}</math> perfectly, then <math>\rho(l) = 1</math> if <math>A > 0</math>, and <math>\rho(l) = -1</math> if <math>A < 0</math>.
  
 +
7) Let <math>w_{t}</math>, for <math>t=0, \pm 1, \pm 2, \dots</math> be a normal white noise process, and consider the series <math>x_{t}=w_{t} w_{t-1}</math>. Determine the mean and autocovariance function of <math>x_{t}</math>, and state whether it is strictly stationary, weakly stationary, or neither.
  
 +
8) Suppose <math>x_{1}, x_{2}, \dots, x_{n}</math> is a sample from the process <math>x_{t} = \mu + w_{t} - 0.8 w_{t-1}</math>, where <math>w_{t} \sim \text{WN}(0, \sigma^{2}_{w})</math>.
  
 +
(a) Show that the mean function is <math>E(x_{t})= \mu</math>.
  
 +
(b) Calculate the standard error of <math>\bar{x}</math> for estimating <math>\mu</math>.
  
 +
(c) Compare your result in (b) to the case where <math>x_{t}</math> is pure white noise. Is the standard error smaller or larger? Explain the result in the context of positive/negative autocorrelation.
  
 +
9) Although the model in Problem 1 (a) is not stationary, the sample ACF can still be computed and informative. For the data you generated in that problem, calculate and plot the sample ACF. Why does the ACF fail to decay to zero rapidly?
  
 +
10)
  
 +
(a) Simulate a series of <math>n=500</math> Gaussian white noise observations and compute the sample ACF, <math>\hat{\rho}(h)</math>, to lag 20. Compare the sample ACF you obtain to the theoretical ACF <math>\rho(h)</math>. Do approximately 95% of the sample autocorrelations fall inside the bounds <math>\pm 2/\sqrt{500}</math>?
  
 +
(b) Repeat part (a) using only <math>n=50</math>. How does changing <math>n</math> affect the variability of the sample ACF and the reliability of the confidence bounds?
  
 +
===References===
 +
* Cryer, J. D., & Chan, K. S. (2008). ''Time Series Analysis: With Applications in R''. Springer.
 +
* Hyndman, R. J., & Athanasopoulos, G. (2021). ''Forecasting: Principles and Practice'' (3rd ed.). OTexts. (Available freely online at otexts.com/fpp3)
 +
* Nelson, C. R., & Plosser, C. I. (1982). Trends and random walks in macroeconomic time series: Some evidence and implications. ''Journal of Monetary Economics'', 10(2), 139-162.
 +
* Shumway, R. H., & Stoffer, D. S. (2017). ''Time Series Analysis and Its Applications: With R Examples'' (4th ed.). Springer.
  
  
 
<hr>
 
<hr>
* SOCR Home page: http://www.socr.umich.edu
+
* SOCR Home page: https://socr.umich.edu
  
{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_TimeSeries}}
+
{{translate|pageName=https://wiki.socr.umich.edu/index.php?title=SMHS_TimeSeries}}

Latest revision as of 14:45, 9 April 2026

Scientific Methods for Health Sciences - Time Series Analysis

Overview

Time series data is a sequence of data points measured at successive, equally spaced points in time. Time series analysis is utilized across various disciplines, such as monitoring industrial processes, tracking business metrics, and analyzing longitudinal health records. In this module, we present a comprehensive introduction to time series data and explore both classical and modern techniques used in this rapidly growing field. The goal is to extract meaningful statistics, identify underlying patterns (such as trends and seasonality), and construct predictive models. We will illustrate the application of these techniques using interactive examples in R.

Motivation

Economic data (e.g., daily share prices, monthly sales, annual income) and biophysical data (e.g., daily temperature, ECG readings, epidemiological counts) are all examples of time series data. A time series is an ordered sequence of values of a variable measured at equally spaced time intervals. What are the most effective ways to model time series data? How can we extract actionable information, make inferences about underlying mechanisms, and forecast future observations? Answering these questions requires understanding the mathematical properties of time series processes and applying robust statistical modeling techniques.

Theory

Components of Time Series

A time series \(x_t\) can generally be decomposed into three primary components, plus an irregular error term:

  • Trend Component (\(T_t\)): A long-run increase or decrease over time (overall upward or downward movement). The trend can be linear or non-linear.
  • Seasonal Component (\(S_t\)): Short-term, regular wave-like patterns completing a cycle within a fixed period (e.g., daily, monthly, quarterly).
  • Cyclical Component (\(C_t\)): Long-term wave-like patterns with variable lengths, often measured peak-to-peak or trough-to-trough, typically tied to economic or biological cycles.
  • Irregular Component (\(I_t\)): Unpredictable, random residuals, or "noise," which may be due to random variations or unusual events.

These components can combine additively or multiplicatively:

  • Additive Model: \(x_{t}=T_{t}+S_{t}+C_{t}+I_{t}\)
  • Multiplicative Model: \(x_{t}=T_{t} \times S_{t} \times C_{t} \times I_{t}\)

A logarithmic transformation can convert a multiplicative model into an additive one\[\log(x_{t})=\log(T_{t})+\log(S_{t})+\log(C_{t})+\log(I_{t})\].

  • Probabilistic models: The components still add or multiply together across time.

SMHS Fig 1 Times Series Analysis.png

Stationarity

Most statistical forecasting methods assume that a time series is approximately stationary. A time series is said to be strictly stationary if its statistical properties are unaffected by a shift in time. In practice, we rely on weak stationarity, which requires: 1. The mean is constant over time\[E[x_t] = \mu\] for all \(t\). 2. The variance is finite and constant\[Var(x_t) = \gamma(0) < \infty\]. 3. The autocovariance between any two points depends only on the time lag \(h\) between them\[Cov(x_t, x_{t+h}) = \gamma(h)\] for all \(t\) and \(h\).

Real-world time series are usually non-stationary (exhibiting trends or varying variance) but can often be transformed to achieve stationarity (e.g., differencing, log transformations).

Autocorrelation and Partial Autocorrelation

To characterize a time series, we examine its mean and covariance structure. For a stationary process, the Autocovariance Function (ACVF) at lag \(h\) is defined as\[\gamma(h) = E[(x_t - \mu)(x_{t+h} - \mu)]\] The Autocorrelation Function (ACF) standardizes this by the variance \(\gamma(0)\)\[\rho(h) = \frac{\gamma(h)}{\gamma(0)}\] Properties of the ACF include\[\rho(0) = 1\], \(|\rho(h)| \le 1\), and \(\rho(h) = \rho(-h)\) (it is an even function).

The Partial Autocorrelation Function (PACF) measures the correlation between \(x_t\) and \(x_{t+h}\) after removing the linear dependence on the intermediate observations \(x_{t+1}, \dots, x_{t+h-1}\). The ACF and PACF are crucial tools for identifying the order of Autoregressive (AR) and Moving Average (MA) models.

Estimation of ACF

Given \(T\) observations, the sample autocovariance is estimated as\[\hat{\gamma}(h) = \frac{1}{T} \sum_{t=1}^{T-h} (x_t - \bar{x})(x_{t+h} - \bar{x})\] The sample ACF is \(\hat{\rho}(h) = \hat{\gamma}(h) / \hat{\gamma}(0)\). Under the null hypothesis that the true process is White Noise, the large-sample standard error of the sample ACF is approximately\[SE(\hat{\rho}(h)) \approx \frac{1}{\sqrt{T}}\] A common rule of thumb is that 95% of the sample autocorrelations for a White Noise process should fall within the bounds \(\pm 2/\sqrt{T}\).

White Noise

A time series \(w_t\) is White Noise (\(WN\)) if it is weakly stationary with\[\rho(h) = \begin{cases} 1 & \text{if } h=0 \\ 0 & \text{if } h \neq 0 \end{cases}\] A special case is Gaussian White Noise, denoted \(w_t \sim \text{WN}(0, \sigma^2)\), where \(w_t\) is independently and identically distributed (i.i.d.) according to a normal distribution.

Backshift Notation and Linear Operators

Modern time series theory heavily utilizes the backshift operator \(B\), defined as \(Bx_t = x_{t-1}\).

  • Differencing can be written as \(\nabla x_t = (1-B)x_t\).
  • An AR(1) model \(x_t = \phi x_{t-1} + w_t\) is written as \((1-\phi B)x_t = w_t\).

Autoregressive (AR) Models

An autoregressive model of order \(p\), denoted AR(\(p\)), models the current value as a linear combination of its past values plus a shock\[x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_p x_{t-p} + w_t\] Using backshift notation\[\phi(B)x_t = w_t\], where \(\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p\). Assumptions: The process must be stationary, which requires that the roots of \(\phi(B) = 0\) lie strictly outside the unit circle. ACF/PACF Patterns: The ACF of a stationary AR(\(p\)) process decays gradually (tails off), while the PACF cuts off abruptly after lag \(p\).

Moving Average (MA) Models

A moving average model of order \(q\), denoted MA(\(q\)), models the current value as a linear combination of current and past white noise shocks\[x_t = w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \dots + \theta_q w_{t-q}\] Using backshift notation\[x_t = \theta(B)w_t\], where \(\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q\). Assumptions: MA processes are always stationary. However, they must be invertible, meaning the roots of \(\theta(B) = 0\) lie strictly outside the unit circle. Invertibility ensures the process can be represented as an infinite-order AR model. ACF/PACF Patterns: The ACF of an MA(\(q\)) process cuts off abruptly after lag \(q\), while the PACF decays gradually (tails off).

ARIMA Models

The Autoregressive Integrated Moving Average model, ARIMA(\(p, d, q\)), combines AR and MA models after differencing the data \(d\) times to achieve stationarity\[\phi(B)(1-B)^d x_t = \theta(B)w_t\] If seasonality is present, we extend this to the Seasonal ARIMA (SARIMA) model, which incorporates seasonal AR, MA, and differencing terms.

In practice, R's forecast package fits and reports ARIMA models with seasonality, e.g.,

ARIMA(p,d,q)(P,D,Q)[m]

These six parameters define how the model handles trends, cycles, and shocks in the data.

For instance, this model ARIMA(2,1,1)(0,1,0)[12] has two parts, the Non-Seasonal component (first 3 parameters), and the Seasonal component, (the last 3 parameters) .

  • Non-Seasonal Component: (p, d, q): These parameters handle the short-term relationships between consecutive observations.
    • p (Autoregressive - AR): The number of lag observations included in the model. A value of 2 means the model uses the two previous time points to predict the next one.
    • d (Degree of Differencing): The number of times the raw observations are differenced to make the data stationary (removing trends). A value of 1 means the model is looking at the change between points rather than the absolute values.
    • q (Moving Average - MA): The size of the moving average window applied to forecast errors. A value of 1 means the model accounts for the error made in the previous prediction.
  • Seasonal Component: (P, D, Q): These parameters handle repeating patterns (like the AirPassengers monthly spikes).
    • P (Seasonal Autoregressive): Similar to p, but specifically for the seasonal lags. For monthly data, P=1 would mean using the value from the same month last year to predict this month, in the output, this is 0.
    • D (Seasonal Differencing): The number of seasonal differences. A value of 1 means the model subtracted last year's value from this year's value to remove seasonal trends.
    • Q (Seasonal Moving Average): Similar to q, but for seasonal forecast errors, in the output, this is 0.
    • [m] (Seasonal Period): The number of periods in each season. For the AirPassengers dataset, this is 12 because the data is monthly and repeats every year.

For a specific model: ARIMA(2,1,1)(0,1,0)[12]

Parameter Value Meaning
p 2 Uses 2 lags for the AR part.
d 1 Applied 1st order differencing to remove the general trend.
q 1 Uses 1 lag for the Moving Average error correction.
P 0 No seasonal autoregressive terms.
D 1 Applied seasonal differencing (Value - Value same month last year).
Q 0 No seasonal moving average terms.
m 12 Data follows a 12-month annual cycle.
  • Control Parameters: The Stepwise and Approximation Flags, stepwise=FALSE and approximation=FALSE, force auto.arima to search more thoroughly through all possible combinations of these 6 parameters rather than using a shortcut search. This results in a better fit but takes longer to compute.

Methods and Data Analysis

Exploratory Data Analysis (EDA) and Diagnostics

Before modeling, we must visualize the data, check for outliers, and assess distributional assumptions. A standard approach involves time series plots, Q-Q plots (to check normality), and examining residuals.

If we fit a simple mean model \(x_t = \mu + w_t\), the residuals \(e_t = x_t - \hat{\mu}\) should resemble White Noise. We diagnose this by checking: 1. Zero mean of residuals. 2. Constant variance across time (homoscedasticity). 3. No autocorrelation (checked via ACF plots of residuals). 4. Normality of residuals (checked via Q-Q plots).

Smoothing Techniques

Smoothing helps identify underlying structures by averaging out the noise.

Moving Averages

A simple centered moving average smooths \(x_t\) using a window of size \(k\)\[v_t = \frac{1}{k} \sum_{j=-(k-1)/2}^{(k-1)/2} x_{t+j}\]

# Simulate White Noise and apply Moving Averages
set.seed(123)
w <- ts(rnorm(150))  # generate 150 data points from standard normal distribution
ma3 <- filter(w, sides=2, rep(1/3, 3))  # 3-period centered moving average
ma9 <- filter(w, sides=2, rep(1/9, 9))  # 9-period centered moving average

plot.ts(w, main="White Noise vs. Moving Averages", ylab="Value")
lines(ma3, col="red", lwd=2)
lines(ma9, col="blue", lwd=2)
legend("topright", legend=c("WN", "MA(3)", "MA(9)"), col=c("black", "red", "blue"), lwd=2)
  • Note: A larger window (lag) results in a smoother time series plot, but introduces more lag in detecting changes.*
Exponential Smoothing

Unlike moving averages, exponential smoothing assigns exponentially decreasing weights as observations get older.

  • Simple Exponential Smoothing (SES): Appropriate for data with no trend or seasonality. The forecast is \(\hat{x}_{t+1} = \alpha x_t + (1-\alpha)\hat{x}_t\), where \(0 < \alpha < 1\) is the smoothing parameter.
  • Holt-Winters Method: Extends SES to capture trends (additive or multiplicative) and seasonality.
# Using built-in AirPassengers dataset
data(AirPassengers)
plot(AirPassengers, main="AirPassengers Dataset", ylab="Passengers")

# Holt-Winters Additive Model
hw_add <- HoltWinters(AirPassengers, seasonal="additive")
plot(hw_add, main="Holt-Winters Additive Filtering")
Nonparametric Smoothing
  • Loess (Local Polynomial Regression): Fits simple models to localized subsets of data. Controlled by the `span` parameter. A larger span yields a smoother curve.
  • Splines: Piecewise polynomials joined smoothly at "knots." Controlled by degrees of freedom or a smoothing parameter (`spar`).
# Loess and Spline Smoothing on a time series
t <- time(AirPassengers)
fit_loess <- lowess(t, AirPassengers, f=0.3) # f is the span
fit_spline <- smooth.spline(t, AirPassengers, spar=0.8)

plot(AirPassengers, main="Nonparametric Smoothing", ylab="Passengers")
lines(fit_loess, col="red", lwd=2)
lines(fit_spline, col="blue", lwd=2, lty=2)
legend("topleft", legend=c("Loess", "Spline"), col=c("red", "blue"), lwd=2, lty=c(1,2))

Modeling Seasonality and Trend

Classical Decomposition

We can decompose a time series into its trend, seasonal, and irregular components using moving averages or Local Regression (LOESS).

# Multiplicative decomposition using LOESS
decomp <- stl(AirPassengers, s.window="periodic")
plot(decomp, main="Classical Decomposition via STL")
Regression with Dummy Variables

Seasonality can be explicitly modeled using regression with dummy variables. For monthly data with an intercept, we use 11 dummy variables to avoid the dummy variable trap.

# Regression with seasonal dummy variables
y <- log(AirPassengers) # Log transform to stabilize variance
t <- time(y)
Q <- factor(cycle(y))   # Creates monthly factors (1-12)

# Fit linear regression with trend and seasonality
reg_model <- lm(y ~ t + Q)
summary(reg_model)

# Diagnostics of the regression model
par(mfrow=c(2,2))
plot(reg_model, main="Regression Diagnostics")
Modern ARIMA Modeling (Box-Jenkins Approach)

The modern standard for time series forecasting involves the Box-Jenkins iterative cycle: Identification, Estimation, and Diagnostic Checking.

1. Identification: Use ACF and PACF plots, and unit root tests (like Augmented Dickey-Fuller) to determine \(d\). Use algorithms to suggest \(p\) and \(q\).

2. Estimation: Maximize the likelihood function to estimate AR and MA coefficients.

3. Diagnostic Checking: Ensure residuals are White Noise.

The forecast package in R automates this efficiently:

# Install and load forecast package if necessary
# install.packages("forecast")
library(forecast)

# dev.off()  # reset display, if necessary

# auto.arima automatically identifies the best SARIMA model
fit <- auto.arima(AirPassengers, seasonal=TRUE, stepwise=FALSE, approximation=FALSE)
summary(fit)

# Check residuals to ensure they are White Noise
checkresiduals(fit)

# Forecast the next 24 months
fc <- forecast(fit, h=24)
plot(fc, main="24-Month Forecast using SARIMA")

Next, let's try to fit manually another (specific) ARIMA(1,1,5) model where the (p,d,q) parameters are hard-coded. Then, compare fitManual to the optimized model (fit <- auto.arima()), by contrasting the corresponding AIC, BIC, and log-likelihood estimates of each model, smaller values indicate better model and higher model fidelity.

Also, compare the 24-month forward forecasts of fitManual vs. the optimized model fit.

fitManual <- arima(AirPassengers, order=c(1,1,5)); summary(fitManual)

fcManual <- forecast(fitManual, h=24)
plot(fcManual, main="(Manual) 24-Month Forecast using SARIMA")

fcManual$model$aic  # [1] 1353.116
fcManual$model$loglik  # [1] -669.5579

Outlier Detection

Outliers can severely distort time series models (inflating variance, skewing ACF). They can be detected visually via time plots and boxplots, or systematically using residuals.

# Creating a dataset with injected outliers
set.seed(42)
dat <- ts(c(28,22,36,26,28,28,26,24,32,30,27,24,33,21,36,32,31,25,24,25,28,36,27,32,34,30,
            25,26,26,25,-44,23,21,30,33,29,27,29,28,22,26,27,16,31,29,36,32,28,40,19,
            37,23,32,29,-2,24,25,27,24,16,29,20,28,27,39,23))

par(mfrow=c(1,2))
plot.ts(dat, main="Time Series with Outliers")
boxplot(dat, main="Boxplot Identifying Outliers")

# Identifying and removing outliers based on extreme boxplot bounds
outlier_bounds <- boxplot.stats(dat)$out
dat_clean <- dat[!(dat %in% outlier_bounds)]

# Re-plot clean data
par(mfrow=c(1,1))
plot.ts(dat_clean, main="Cleaned Time Series", col="blue", lwd=2)

Applications

1. Epidemiology and Public Health: Time series models are essential for tracking disease outbreaks (e.g., COVID-19, influenza). SARIMA models help forecast case counts, accounting for weekly seasonal cycles (e.g., higher reporting on weekdays) and long-term trends, thereby informing hospital resource allocation. 2. Health Economics: Forecasting hospital readmission rates, healthcare costs, and resource utilization over time using ARIMA and Exponential Smoothing to optimize budget planning. 3. Clinical Monitoring: Analyzing ECG readings or continuous glucose monitoring data. Signal processing techniques combined with MA filtering help isolate true physiological signals from high-frequency noise. 4. Macroeconomics: As investigated by Nelson and Plosser (1982), understanding whether macroeconomic time series (like GDP or stock prices) are stationary around a deterministic trend or non-stationary stochastic processes fundamentally changes how we interpret economic shocks and formulate policy.

Software

Modern time series analysis in R relies on a few key packages:

  • stats (Base R): Contains foundational functions like `ts()`, `acf()`, `pacf()`, `arima()`, `HoltWinters()`, and `stl()`.
  • forecast (now transitioning to fable): Provides `auto.arima()`, `checkresiduals()`, `forecast()`, and modern tidy time series frameworks (`tsibble`).
  • astsa: Applied Statistical Time Series Analysis, highly recommended for its companion datasets and straightforward functions like `acf2()` (combines ACF and PACF in one plot).
  • tseries: Provides unit root tests like `adf.test()` to check for stationarity.

Problems

1) Consider a signal-plus-noise model of the general form \(x_{t}=s_{t}+w_{t}\), where \(w_{t}\) is Gaussian white noise with \(\sigma^{2}_{w} = 1\). Simulate and plot \(n=200\) observations from each of the following two models.

(a) \(x_{t} = s_{t}+w_{t}\), for \(t=1,2,\dots,200\), where \(s_{t}=\begin{cases} 0 & \text{ if } t=1,2,\dots,100 \\ 10\exp \left\{ -(t-100)/20 \right\} \cos\left( 2\pi t/4 \right) & \text{ if } t=101,102,\dots,200 \end{cases}\)

(b) \(x_{t}=s_{t}+w_{t}\), for \(t=1,2,\dots,200\), where \(s_{t}=\begin{cases} 0 & \text{ if } t=1,2,\dots,100 \\ 10\exp \left\{ -(t-100)/200 \right\} \cos\left( 2\pi t/4 \right) & \text{ if } t=101,102,\dots,200 \end{cases}\)

(c) Compare the signal modulators \((a) \exp \left\{-t /20 \right\}\) and \((b) \exp \left\{-t/200 \right\}\), for \(t=1,2,\dots,100\). What is the primary difference in the persistence of the signal?

2) (a) Generate \(n=100\) observations from the autoregression \(x_{t}=-0.9 x_{t-2} + w_{t}\), with \(\sigma_w=1\). Next, apply the moving average filter \(v_{t}=\left(x_{t}+x_{t-1}+x_{t-2}+x_{t-3} \right)/4\) to the data you generated. Now, plot \(x_{t}\) as a line and superimpose \(v_{t}\) as a dashed line. Comment on the behavior of \(x_{t}\) and how applying the moving average filter changes that behavior. [Hint: use v <- filter(x, rep(1/4,4), sides=1) for the filter].

(b) Repeat (a) but with \(x_{t}=\cos \left( 2 \pi t/4 \right)\).

(c) Repeat (b) but with added \(N(0,1)\) noise, \(x_{t}=\cos \left(2 \pi t/4 \right) + w_{t}\).

(d) Compare and contrast the smoothing effects observed in \((a) - (c)\).

3) For the two series, \(x_{t}\) in Problem 1 (a) and (b):

(a) Compute and plot the mean functions \(\mu_{x}(t)\) for \(t=1,2,\dots,200\).

(b) Explain why these models are not stationary.

4) Consider the time series \(x_{t} = \beta_{1} + \beta_{2} t + w_{t}\), where \(\beta_{1}\) and \(\beta_{2}\) are known constants and \(w_{t}\) is a white noise process with variance \(\sigma^{2}_{w}\).

(a) Determine whether \(x_{t}\) is stationary. Justify your answer mathematically.

(b) Show that the process \(y_{t}=x_{t} - x_{t-1}\) is stationary.

(c) Show that the mean of the moving average \(v_{t}= \frac{1}{2q+1} \sum_{j=-q}^{q} x_{t-j}\) is \(\beta_{1} + \beta_{2} t\), and provide a simplified expression for its autocovariance function.

5) A time series with a periodic component can be constructed from \(x_{t} = U_{1} \sin(2 \pi \omega_{0} t) + U_{2} \cos(2 \pi \omega_{0} t)\), where \(U_{1}\) and \(U_{2}\) are independent random variables with zero means and \(E(U^{2}_{1}) = E(U^{2}_{2}) = \sigma^{2}\). The constant \(\omega_{0}\) determines the period. Show that this series is weakly stationary with autocovariance function \(\gamma(h) = \sigma^{2} \cos(2 \pi \omega_{0} h)\). [Hint: you will need to refer to standard trigonometric identities for products of sine and cosine functions].

6) Suppose we would like to predict a single stationary series \(x_{t}\) with zero mean and autocorrelation function \(\gamma(h)\) at some time in the future, say \(t+l\), for \(l>0\).

(a) If we predict using only \(x_{t}\) and some scale multiple \(A\), show that the mean-square prediction error \(MSE(A)=E \left[ (x_{t+l}-A x_{t})^{2} \right]\) is minimized by the value \(A = \rho(l)\).

(b) Show that the minimum mean-square prediction error is \(MSE(A)= \gamma(0) \left[ 1- \rho^{2}(l) \right]\).

(c) Show that if \(x_{t+l} = Ax_{t}\) perfectly, then \(\rho(l) = 1\) if \(A > 0\), and \(\rho(l) = -1\) if \(A < 0\).

7) Let \(w_{t}\), for \(t=0, \pm 1, \pm 2, \dots\) be a normal white noise process, and consider the series \(x_{t}=w_{t} w_{t-1}\). Determine the mean and autocovariance function of \(x_{t}\), and state whether it is strictly stationary, weakly stationary, or neither.

8) Suppose \(x_{1}, x_{2}, \dots, x_{n}\) is a sample from the process \(x_{t} = \mu + w_{t} - 0.8 w_{t-1}\), where \(w_{t} \sim \text{WN}(0, \sigma^{2}_{w})\).

(a) Show that the mean function is \(E(x_{t})= \mu\).

(b) Calculate the standard error of \(\bar{x}\) for estimating \(\mu\).

(c) Compare your result in (b) to the case where \(x_{t}\) is pure white noise. Is the standard error smaller or larger? Explain the result in the context of positive/negative autocorrelation.

9) Although the model in Problem 1 (a) is not stationary, the sample ACF can still be computed and informative. For the data you generated in that problem, calculate and plot the sample ACF. Why does the ACF fail to decay to zero rapidly?

10)

(a) Simulate a series of \(n=500\) Gaussian white noise observations and compute the sample ACF, \(\hat{\rho}(h)\), to lag 20. Compare the sample ACF you obtain to the theoretical ACF \(\rho(h)\). Do approximately 95% of the sample autocorrelations fall inside the bounds \(\pm 2/\sqrt{500}\)?

(b) Repeat part (a) using only \(n=50\). How does changing \(n\) affect the variability of the sample ACF and the reliability of the confidence bounds?

References

  • Cryer, J. D., & Chan, K. S. (2008). Time Series Analysis: With Applications in R. Springer.
  • Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice (3rd ed.). OTexts. (Available freely online at otexts.com/fpp3)
  • Nelson, C. R., & Plosser, C. I. (1982). Trends and random walks in macroeconomic time series: Some evidence and implications. Journal of Monetary Economics, 10(2), 139-162.
  • Shumway, R. H., & Stoffer, D. S. (2017). Time Series Analysis and Its Applications: With R Examples (4th ed.). Springer.





Translate this page:

(default)
Uk flag.gif

Deutsch
De flag.gif

Español
Es flag.gif

Français
Fr flag.gif

Italiano
It flag.gif

Português
Pt flag.gif

日本語
Jp flag.gif

България
Bg flag.gif

الامارات العربية المتحدة
Ae flag.gif

Suomi
Fi flag.gif

इस भाषा में
In flag.gif

Norge
No flag.png

한국어
Kr flag.gif

中文
Cn flag.gif

繁体中文
Cn flag.gif

Русский
Ru flag.gif

Nederlands
Nl flag.gif

Ελληνικά
Gr flag.gif

Hrvatska
Hr flag.gif

Česká republika
Cz flag.gif

Danmark
Dk flag.gif

Polska
Pl flag.png

România
Ro flag.png

Sverige
Se flag.gif