
Time series models (in the time domain) involve lagged terms and may involve differenced data to account for trend.  Each has its own useful notation.

Backshift Operator

Using \(B\) before either a value of the series \(x_t\) or an error term \(w_t\) means to move that element back one time period.  For instance,

\(Bx_t = x_{t-1}\).

A “power” of B means to repeatedly apply the backshift in order to move back a number of time periods that equals the “power.”  As an example,

\(B^2 x_t = x_{t-2}\).

\(x_{t-2}\) represents \(x_t\) two units back in time.  \(B^k x_t = x_{t-k}\) represents \(x_t\) \(k\) units back in time.  The backshift operator \(B\) doesn’t operate on coefficients because they are fixed quantities that do not move in time.  For example, \(B\theta_1 = \theta_1\).
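The backshift rule can be checked numerically. The following is a minimal sketch, assuming the series is stored as a NumPy array indexed by time; the helper name `backshift` and the sample values are hypothetical, and values before the start of the series are marked as NaN because they are not observed.

```python
import numpy as np

def backshift(x, k=1):
    """Return B^k x: position t holds x[t-k]; the first k positions are NaN."""
    x = np.asarray(x, dtype=float)
    if k < 0:
        raise ValueError("k must be non-negative")
    out = np.full_like(x, np.nan)
    if k < len(x):
        out[k:] = x[:len(x) - k]
    return out

x = np.array([10.0, 12.0, 15.0, 11.0, 14.0])  # illustrative values only
bx = backshift(x)       # B x_t   = x_{t-1}
b2x = backshift(x, 2)   # B^2 x_t = x_{t-2}
```

Applying `backshift` twice with `k=1` gives the same result as applying it once with `k=2`, which is what the "power" notation expresses.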

AR Models and the AR Polynomial

AR models can be written compactly using an “AR polynomial” involving coefficients and backshift operators.  Let p = the maximum order (lag) of the AR terms in the model.  The general form for an AR polynomial is

\(\Phi(B) = 1-\phi_1B- \dots - \phi_p B^p\).

Using the AR polynomial, one way to write an AR model is

\(\Phi(B)x_t = \delta + w_t\).


Consider the AR(1) model \(x_t = \delta + \phi_1 x_{t-1} + w_t\), where \(w_t \sim \text{iid } N(0, \sigma^2_w)\).  For an AR(1), the maximum lag = 1 so the AR polynomial is

\(\Phi(B) = 1-\phi_1B\)

and the model can be written

\((1-\phi_1B)x_t = \delta + w_t\).

To check that this works, we can multiply out the left side to get

\(x_t - \phi_1x_{t-1} = \delta +w_t\).

Then, swing the \(-\phi_1x_{t-1}\) term over to the right side and we get

\(x_t = \delta + \phi_1x_{t-1}+w_t\).

An AR(2) model is \(x_t = \delta + \phi_1x_{t-1}+\phi_2x_{t-2}+w_t\).  That is, \(x_t\) is a linear function of the values of \(x\) at the previous two lags.  The AR polynomial for an AR(2) model is

\(\Phi(B) = 1-\phi_1B-\phi_2B^2\).

The AR(2) model could be written as \(( 1-\phi_1B-\phi_2B^2) x_t = \delta + w_t\), or as \(\Phi(B)x_t = \delta + w_t\) with an additional explanation that \(\Phi(B) = 1-\phi_1B-\phi_2B^2\).

An AR(p) model is \(x_t = \delta + \phi_1x_{t-1}+\phi_2x_{t-2}+ \dots + \phi_p x_{t-p} + w_t\), where \(\phi_1, \phi_2, \dots, \phi_p\) are constants and may be greater than 1.  (Recall that \( |\phi_1| < 1 \) for an AR(1) model.)  Here \(x_t\) is a linear function of the values of \(x\) at the previous \(p\) lags.

A shorthand notation for the AR polynomial is \(\Phi(B)\) and a general AR model might be written as \(\Phi(B)x_t = \delta + w_t\).  Of course, you would have to specify the order of the model somewhere on the side.
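The identity \(\Phi(B)x_t = \delta + w_t\) can also be verified on simulated data. This is a rough sketch, assuming NumPy, an AR(2) with illustrative parameter values, and start-up values of zero; `apply_polynomial` is a hypothetical helper, not a library function.

```python
import numpy as np

def apply_polynomial(coeffs, x):
    """Apply c_0 + c_1*B + ... + c_p*B^p to x, returning values for t = p, ..., n-1."""
    x = np.asarray(x, dtype=float)
    p = len(coeffs) - 1
    n = len(x)
    out = np.zeros(n - p)
    for k, c in enumerate(coeffs):
        out += c * x[p - k : n - k]   # x_{t-k} for t = p, ..., n-1
    return out

rng = np.random.default_rng(0)
delta, phi1, phi2 = 2.0, 0.5, 0.3     # illustrative values only
n = 200
w = rng.normal(size=n)                # w_t ~ iid N(0, 1)

# Simulate the AR(2) recursively; x[0] and x[1] are arbitrary start-up values.
x = np.zeros(n)
for t in range(2, n):
    x[t] = delta + phi1 * x[t - 1] + phi2 * x[t - 2] + w[t]

# Phi(B) = 1 - phi1*B - phi2*B^2 applied to x should give delta + w_t.
lhs = apply_polynomial([1.0, -phi1, -phi2], x)
rhs = delta + w[2:]
```

Here `lhs` and `rhs` agree up to floating-point error for every \(t \ge 2\), which mirrors the "multiply out the left side" check done by hand for the AR(1).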

MA Models

  • An MA(1) model \(x_t = \mu + w_t + \theta_1 w_{t-1}\) could be written as \(x_t = \mu + (1+\theta_1B)w_t\).  A factor such as \(1+\theta_1B\) is called the MA polynomial, and it is denoted as \(\Theta(B)\).
  • An MA(2) model is defined as \(x_t = \mu + w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2}\) and could be written as \(x_t = \mu + (1+\theta_1B+\theta_2B^2)w_t\).  Here, the MA polynomial is \(\Theta(B) = (1+\theta_1B+\theta_2B^2)\).

In general, the MA polynomial is \(\Theta(B) = (1+\theta_1B+\dots +\theta_qB^q)\), where \(q\) = maximum order (lag) for MA terms in the model.

In general, we can write an MA model as \(x_t - \mu = \Theta(B)w_t\).
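As a quick numerical illustration of \(x_t - \mu = \Theta(B)w_t\), the sketch below builds an MA(2) series term by term and again by applying the MA polynomial as a whole. It assumes NumPy and illustrative parameter values; `np.convolve` with `mode="valid"` is used here only as one convenient way to form the lagged weighted sum.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, theta1, theta2 = 5.0, 0.6, -0.2   # illustrative values only
w = rng.normal(size=100)              # w_t ~ iid N(0, 1)

# MA(2) written out from the definition, for t = 2, ..., 99.
x_direct = mu + w[2:] + theta1 * w[1:-1] + theta2 * w[:-2]

# The same series via Theta(B) = 1 + theta1*B + theta2*B^2.
# Coefficients are listed from lag 0 upward; "valid" keeps only the
# time points where all required lags of w are available.
theta_poly = np.array([1.0, theta1, theta2])
x_poly = mu + np.convolve(w, theta_poly, mode="valid")
```

Both constructions give the same 98 values, since the convolution computes \(w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2}\) at each \(t\).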

Models with Both AR and MA Terms

A model that involves both AR and MA terms might be written \(\Phi(B)(x_t-\mu) = \Theta(B)w_t\) or possibly even

\[(x_t-\mu) = \frac{\Theta(B)}{\Phi(B)}w_t.\]
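The first form, \(\Phi(B)(x_t-\mu) = \Theta(B)w_t\), can be checked on a simulated series. The following sketch assumes NumPy and an ARMA(1,1) with illustrative parameter values, started at the mean; it is only a check of the notation, not an estimation procedure.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, phi1, theta1 = 10.0, 0.7, 0.4     # illustrative values only
n = 150
w = rng.normal(size=n)                # w_t ~ iid N(0, 1)

# Simulate x_t - mu = phi1*(x_{t-1} - mu) + w_t + theta1*w_{t-1}.
x = np.full(n, mu)                    # x[0] = mu is an arbitrary start-up value
for t in range(1, n):
    x[t] = mu + phi1 * (x[t - 1] - mu) + w[t] + theta1 * w[t - 1]

z = x - mu
lhs = z[1:] - phi1 * z[:-1]           # Phi(B)(x_t - mu) with Phi(B) = 1 - phi1*B
rhs = w[1:] + theta1 * w[:-1]         # Theta(B) w_t with Theta(B) = 1 + theta1*B
```

The two sides match for every \(t \ge 1\) up to floating-point error.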

Note: Many textbooks and software programs define the MA polynomial with negative signs rather than positive signs as above.  This doesn’t change the properties of the model or, with a sample, the overall fit of the model.  It only changes the algebraic signs of the MA coefficients.  Always check to see how your software is defining the MA polynomial.  For example, is the MA(1) polynomial \(1 + \theta_1B\) or \(1 - \theta_1B\)?
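To make the sign convention concrete, here is a small sketch, assuming NumPy and an illustrative coefficient value. It shows that an MA(1) written as \(1 + \theta_1B\) with coefficient 0.6 is the same series as one written as \(1 - \theta_1B\) with coefficient -0.6; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
w = rng.normal(size=50)               # w_t ~ iid N(0, 1)

theta_plus = 0.6                      # coefficient under the 1 + theta*B convention
theta_minus = -theta_plus             # same model under the 1 - theta*B convention

x_plus = w[1:] + theta_plus * w[:-1]      # (1 + theta_plus*B) w_t
x_minus = w[1:] - theta_minus * w[:-1]    # (1 - theta_minus*B) w_t
```

The two series are identical; only the reported sign of the coefficient differs.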


Differencing

Often differencing is used to account for nonstationarity that occurs in the form of trend and/or seasonality.

The difference \(x_t - x_{t-1}\) can be expressed as \((1-B)x_t\).

An alternative notation for a difference is

\(\nabla = 1-B\).

Thus

\(\nabla x_t = (1-B)x_t = x_t-x_{t-1}\).

  • A subscript defines a difference of a lag equal to the subscript.  For instance,

\(\nabla_{12}x_t = x_t - x_{t-12}\).

This type of difference is often used with monthly data that exhibits seasonality.  The idea is that differences from the previous year may be, on average, about the same for each month of a year.

  • A superscript says to repeat the differencing the specified number of times.  As an example,

\(\nabla^2 x_t = (1-B)^2x_t = (1-2B+B^2)x_t = x_t -2x_{t-1}+x_{t-2}\).

In words, this is a first difference of the first differences.
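The three kinds of difference above can be computed directly with array slicing. This is a minimal sketch, assuming NumPy and a made-up series \(x_t = t^2\) chosen only because its second difference is a known constant.

```python
import numpy as np

x = np.arange(36, dtype=float) ** 2   # hypothetical series: x_t = t^2, t = 0, ..., 35

# First difference: (1 - B) x_t = x_t - x_{t-1}
d1 = x[1:] - x[:-1]

# Lag-12 (seasonal) difference: x_t - x_{t-12}
d12 = x[12:] - x[:-12]

# Second difference, computed two ways:
d2 = np.diff(x, n=2)                          # difference of the first differences
d2_expanded = x[2:] - 2 * x[1:-1] + x[:-2]    # x_t - 2 x_{t-1} + x_{t-2}

# Expanding (1 - B)^2 by multiplying polynomials in B (lowest power first).
coeffs = np.convolve([1.0, -1.0], [1.0, -1.0])
```

`coeffs` holds the coefficients of \(1 - 2B + B^2\), and the two versions of the second difference agree. Each difference shortens the series by the lag used, because the earliest values have no predecessor to subtract.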