Autoregressive Models: PACF and Information Criteria

15 minute read

Auto Regressive Models

In an Autoregressive model, or AR model, today’s value equals a mean plus a fraction phi of yesterday’s value, plus noise. Since there is only one lagged value on the right hand side, this is called an AR model of order 1, or simply an AR(1) model. If the AR parameter, phi, is one, then the process is a random walk. If phi is zero, then the process is white noise. In order for the process to be stable and stationary, phi has to be between -1 and +1.

As an example, suppose R(t) is a time series of stock returns. If phi is negative, then a positive return last period, at time t-1, implies that this period’s return is more likely to be negative. This is referred to as “mean reversion”. On the other hand, if phi is positive, then a positive return last period implies that this period’s return is expected to be positive. This is referred to as “momentum”.

AR(1) model parameters

Here are four simulated time series with different AR parameters. When phi equals 0.9, it looks close to a random walk. When phi equals -0.9, the process looks more erratic - a large positive value is usually followed by a large negative one. The bottom two are similar, but are less exaggerated and closer to white noise.

Here are four autocorrelation functions for different AR parameters. The autocorrelation decays exponentially at a rate of phi. Therefore if phi is 0.9, the lag-1 autocorrelation is 0.9, the lag-2 autocorrelation is (0.9)^2, the lag-3 autocorrelation is (0.9)^3, etc. When phi is negative, the autocorrelation function still decays exponentially, but the signs of the autocorrelation function reverse at each lag.

Higher order AR models

So far, we’ve been only looking at AR(1) models. The model can be extended to include more lagged values and more phi parameters. Here we show an AR(1), an AR(2), and an AR(3).

Simulating an AR process

Often, if you want to study and understand a pure AR process, it is useful to work with simulated data. statsmodels provides modules for simulating AR processes. First, import the class, ArmaProcess. Then define the order and parameters of the AR process.

Simulate AR(1) Time Series using arima_process module

We will simulate and plot a few AR(1) time series, each with a different parameter, phi, using the arima_process module in statsmodels. First we will look at an AR(1) model with a large positive and a large negative phi.

There are a few conventions when using the arima_process module that require some explanation. First, these routines were made very generally to handle both AR and MA models.

Second, when inputting the coefficients we must include the zero-lag coefficient of 1, and the sign of the other coefficient is the opposite of what we have been using. For example, for an AR(1) process with phi = 0.9, the second element of the ar array should be the opposite sign, -0.9. This is consistent with the time series literature in the field of signal processing. We also have to input the MA parameters. We will learn about MA models later, so for now, just ignore the MA part. Then, we create an instance of the class ArmaProcess. To simulate data, use the method generate_sample, with the number of simulated samples as an argument.

So, for example, for an AR(1) process with phi = 0.9, the array representing the AR parameters would be ar = np.array([1, -0.9]).

import matplotlib.pyplot as plt
import numpy as np
# import the module for simulating data
from statsmodels.tsa.arima_process import ArmaProcess

plt.figure(figsize=(10, 10))

# Plot 1: AR parameter = +0.9
ar1 = np.array([1, -0.9])
ma1 = np.array([1])
AR_object1 = ArmaProcess(ar1, ma1)
simulated_data_1 = AR_object1.generate_sample(nsample=1000)

# Plot 2: AR parameter = -0.9
ar2 = np.array([1, 0.9])
ma2 = np.array([1])
AR_object2 = ArmaProcess(ar2, ma2)
simulated_data_2 = AR_object2.generate_sample(nsample=1000)


The two AR parameters produce very different looking time series plots, but next we’ll see how to really be able to distinguish the time series.

Compare the ACF for Several AR Time Series

The autocorrelation function decays exponentially for an AR time series at a rate of the AR parameter. For example, if the AR parameter,phi = 0.9 , the first-lag autocorrelation will be 0.9, the second-lag will be (0.9)^2, the third-lag will be (0.9)^3, etc. A smaller AR parameter will have a steeper decay. For a negative AR parameter, say -0.9, the decay will flip signs, so the first-lag autocorrelation will be -0.9, the second-lag will be positive and third will be negative, etc.

Let’s compute the autocorrelation function for each of the three simulated datasets using the plot_acf function with 20 lags (and suppress the confidence intervals by setting alpha=1).

# Import the plot_acf module from statsmodels
from import plot_acf

# Plot 1: AR parameter = +0.9
plot_acf(simulated_data_1, alpha=1, lags=20)

# Plot 2: AR parameter = -0.9
plot_acf(simulated_data_2, alpha=1, lags=20)

ar3 = np.array([1, -0.3])
ma3 = np.array([1])
AR_object3 = ArmaProcess(ar3, ma3)
simulated_data_3 = AR_object3.generate_sample(nsample=1000)

# Plot 3: AR parameter = +0.3
plot_acf(simulated_data_3, alpha=1, lags=20)




The ACF plots match what we predicted.

Estimating and forecasting an AR Model

statsmodels has another module for estimating the parameters of a given AR model. import ARMA, which is a class, and create an instance of that class called mod, with the arguments being the data that you’re trying to fit, and the order of the model. The order (1,0) means you’re fitting the data to an AR(1) model. An order (2,0) would mean you’re fitting the data to an AR(2) model. The second part of the order is the MA part, which will be discussed in the next chapter. Once you instantiate the class, you can use the method fit to estimate the model, and store the results in result.

To see the full output, use the summary method on result. The coefficients for the mean mu and AR(1) parameter phi are highlighted in red. In the simulated data, mu was zero and phi = 0.9, and you can see that the estimated parameters are very close to the true parameters.

If you just want to see the coefficients rather than the entire regression output, you can use the .params property, which returns an array of the fitted coefficients, mu and phi in this case.

Example: Estimate the AR parameter using simulated data

In this example we will estimate the AR(1) parameter, phi, of one of the simulated series that we generated earlier. Since the parameters are known for a simulated series, it is a good way to understand the estimation routines before applying it to real data.

For simulated_data_1 with a true phi of 0.9, we will print out the estimate of phi. In addition, we will also print out the entire output that is produced when we fit a time series, so we can get an idea of what other tests and summary statistics are available in statsmodels.

Create an instance of the ARMA class called mod using the simulated data simulated_data_1 and the order (p,q) of the model (in this case, for an AR(1)), is order=(1,0).

import warnings

# Import the ARMA module from statsmodels
from statsmodels.tsa.arima_model import ARMA

# Fit an AR(1) model to the first simulated data
mod = ARMA(simulated_data_1, order=(1, 0))
res =

# Print out summary information on the fit

# Print out the estimate for the constant and for phi
print("When the true phi=0.9,true mu=" + str(np.round(np.mean(simulated_data_1), 2)) +
      " the estimate of phi and mu are:")
                              ARMA Model Results                              
Dep. Variable:                      y   No. Observations:                 1000
Model:                     ARMA(1, 0)   Log Likelihood               -1429.721
Method:                       css-mle   S.D. of innovations              1.010
Date:                Sun, 28 Feb 2021   AIC                           2865.442
Time:                        18:42:08   BIC                           2880.166
Sample:                             0   HQIC                          2871.038

                 coef    std err          z      P>|z|      [0.025      0.975]
const          0.2416      0.321      0.753      0.451      -0.387       0.870
ar.L1.y        0.9013      0.014     66.041      0.000       0.875       0.928
                  Real          Imaginary           Modulus         Frequency
AR.1            1.1095           +0.0000j            1.1095            0.0000
When the true phi=0.9,true mu=0.25 the estimate of phi and mu are:
[0.24164318 0.90133825]

Notice how close the estimated parameter is to the true parameter.

Forecasting with an AR Model

In addition to estimating the parameters of a model that we did in the last section, we can also do forecasting, both in-sample and out-of-sample using statsmodels.

  • The in-sample is a forecast of the next data point using the data up to that point, and
  • The out-of-sample forecasts any number of data points in the future. These forecasts can be made using either the predict() method if you want the forecasts in the form of a series of data, or using the plot_predict() method if we want a plot of the forecasted data. We supply the starting point for forecasting and the ending point, which can be any number of data points after the data set ends.

For the simulated series simulated_data_1 with phi=0.9, we will plot in-sample and out-of-sample forecasts.

  • Create an instance of the ARMA class called mod using the simulated data simulated_data_1 and the order (p,q) of the model (in this case, for an AR(1) order=(1,0)
  • Fit the model mod using the method .fit() and save it in a results object called res
  • Plot the in-sample and out-of-sample forecasts of the data using the plot_predict() method
  • Start the forecast 10 data points before the end of the 1000 point series at 990, and end the forecast 10 data points after the end of the series at point 1010
# Import the ARMA module from statsmodels
from statsmodels.tsa.arima_model import ARMA

fig, ax = plt.subplots(figsize=(10, 5))

# Forecast the first AR(1) model
mod = ARMA(simulated_data_1, order=(1, 0))
res =
res.plot_predict(start=990, end=1010, ax=ax)


Notice how, when phi is high like here, the forecast gradually moves to the long term mean of zero, but if phi were low, it would move much quicker to the long term mean. As you can see below for phi = 0.3

fig, ax = plt.subplots(figsize=(10, 5))

# Forecast the first AR(1) model
mod = ARMA(simulated_data_3, order=(1, 0))
res =
res.plot_predict(start=990, end=1010, ax=ax)


Example: Forecast Interest Rates

We will now use the forecasting techniques and apply it to real data rather than simulated data. The annual data of 10-year interest rates going back 56 years, which is in a Series called annual_interest_rate_data. Being able to forecast interest rates is of enormous importance, not only for bond investors but also for individuals like new homeowners who must decide between fixed and floating rate mortgages.

In an earlier post, we saw that there is some mean reversion in interest rates over long horizons. In other words, when interest rates are high, they tend to drop and when they are low, they tend to rise over time. Currently they are below long-term rates, so they are expected to rise, but an AR model attempts to quantify how much they are expected to rise.

import pandas as pd
daily_rates = pd.read_csv('data/daily_interest_rates_us.csv', index_col=['DATE'], parse_dates=True)
daily_rates.columns = ['APR']
1950-01-01 1.5
1950-02-01 1.5
1950-03-01 1.5
1950-04-01 1.5
1950-05-01 1.5
annual_interest_rate_data = daily_rates.resample('A').mean()
1950-12-31 1.590833
1951-12-31 1.750000
1952-12-31 1.750000
1953-12-31 1.990000
1954-12-31 1.597500
# Import the ARMA module from statsmodels
from statsmodels.tsa.arima_model import ARMA

fig, ax = plt.subplots(figsize=(10, 5))

# Forecast interest rates using an AR(1) model
mod = ARMA(annual_interest_rate_data, order=(1, 0))
res =

# Plot the original series and the forecasted series
res.plot_predict(start='2000', end='2026', ax=ax)


According to an AR(1) model, 10-year interest rates are forecasted to rise from 0.56%, towards the end of 2026 to ~2.00% in five years.

Note: This is simulated data. I wish interest rates were really this low (0.56%) at the time of writing this post.

Is AR Model just a Random Walk?

Sometimes it is difficult to distinguish between a time series that is slightly mean reverting and a time series that does not mean revert at all, like a random walk. We will compare the ACF for the slightly mean-reverting interest rate series of the last section with a simulated random walk with the same number of observations.

We should notice when plotting the autocorrelation of these two series side-by-side that they look very similar if the AR model is Random Walk.

# Generate random steps with mean=0 and standard deviation=1
simulated_data = np.random.normal(loc=0, scale=1, size=annual_interest_rate_data.shape[0])
# Import the plot_acf module from statsmodels
from import plot_acf

# Plot the interest rate series and the simulated random walk series side-by-side
fig, axes = plt.subplots(1,2, figsize=(10,5))

# Plot the autocorrelation of the interest rate series in the top plot
fig = plot_acf(annual_interest_rate_data, alpha=1, lags=12, ax=axes[0])

# Plot the autocorrelation of the simulated random walk series in the bottom plot
fig = plot_acf(simulated_data, alpha=1, lags=12, ax=axes[1])

# Label axes
axes[0].set_title("Interest Rate Data")
axes[1].set_title("Simulated Random Walk Data")


It appears that the interest rates are indeed mean-reverting and not just a random walk.

Choosing the right model

In practice, we will ordinarily not be told the order of the model that we’re trying to estimate.

Identifying the order of an AR Model

There are two techniques that can help determine the order of the AR model:

  1. The Partial Autocorrelation Function, and
  2. The Information Criteria

The Partial Autocorrelation Function measures the incremental benefit of adding another lag.

Imagine running several regressions, where you regress returns on more and more lagged values. The coefficients in the red boxes represent the values of the partial autocorrelation function for different lags. For example, in the bottom row, the coefficient in the red box, phi 4-4, is the lag-4 value of the Partial Autocorrelation Function, and it represents how significant adding a fourth lag is when you already have three lags.

plot_pacf is the statsmodels function for plotting the partial autocorrelation function. The arguments are the same as that of the plot_acf module that we saw earlier. The input x is a series or array. The argument lags indicates how many lags of the parital autocorrelation function will be plotted. And the alpha argument sets the width of the confidence interval.

Comparison of PACF for different AR Models

These plots show the Partial Autocorrelation Function for AR models of different orders.

In the upper left, for an AR(1) model, only the lag-1 PACF is significantly different from zero. Similarly, for an AR(2) model, two lags are different from zero, and for and AR(3), three lags are different from zero. Finally, for White Noise, there are no lags that are significantly different from zero.

The need for information criteria

The more parameters in a model, the better the model will fit the data. But this can lead to overfitting of the data. This is where information criteria comes into play. The information criteria adjusts the goodness-of-fit of a model by imposing a penalty based on the number of parameters used. Two common adjusted goodness-of-fit measures are called the Akaike Information Criterion and the Bayesian Information Criterion.


This is the full output from estimating an AR(2) model. The AIC and BIC are highlighted in the red box.

To get the AIC and BIC statistics, you follow the same procedure from the last section to fit the data to a model. In the last section, we learned how to get the full output using summary or just the AR parameters using the params attribute. We can also get the AIC or BIC using those attributes.

Where to use information criteria?

In practice, the way to use the information criteria is to fit several models, each with a different number of parameters, and choose the one with the lowest Bayesian information criterion.

Suppose we are given a time series of data, and unknown to us, it was simulated from an AR(3) model. Here is a plot of the BIC when we fit the data to an AR(1) up to an AR(8) model. You can see that the lowest BIC occurs for an AR(3).

Example using PACF to identify the order of AR Model

One useful tool to identify the order of an AR model is to look at the Partial Autocorrelation Function (PACF). In this example, we will simulate two time series, an AR(1) and an AR(2), and calculate the sample PACF for each. We will notice that for an AR(1), the PACF should have a significant lag-1 value, and roughly zeros after that. And for an AR(2), the sample PACF should have significant lag-1 and lag-2 values, and zeros after that.

# Import the modules for simulating data and for plotting the PACF
from statsmodels.tsa.arima_process import ArmaProcess
from import plot_pacf

# Simulate AR(1) with phi=+0.6
ma = np.array([1])
ar = np.array([1, -0.6])
AR_object = ArmaProcess(ar, ma)
simulated_data_1 = AR_object.generate_sample(nsample=5000)

# Plot PACF for AR(1)
plot_pacf(simulated_data_1, lags=20)

# Simulate AR(2) with phi1=+0.6, phi2=+0.3
ma = np.array([1])
ar = np.array([1, -0.6, -0.3])
AR_object = ArmaProcess(ar, ma)
simulated_data_2 = AR_object.generate_sample(nsample=5000)

# Plot PACF for AR(2)
plot_pacf(simulated_data_2, lags=20)



Notice that the number of significant lags for the PACF indicate the order of the AR model

Estimate the Order of Model using BIC

Another tool to identify the order of a model is to look at the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). These measures compute the goodness of fit with the estimated parameters, but apply a penalty function on the number of parameters in the model. We will take the AR(2) simulated data from above section, saved as simulated_data_2, and compute the BIC as you vary the order, p, in an AR(p) from 0 to 6.

# Import the module for estimating an ARMA model
from statsmodels.tsa.arima_model import ARMA

# Fit the data to an AR(p) for p = 0,...,6 , and save the BIC
BIC = np.zeros(7)
for p in range(7):
    mod = ARMA(simulated_data_2, order=(p,0))
    res =
# Save BIC for AR(p)    
    BIC[p] = res.bic

# Plot the BIC as a function of p
plt.plot(range(1,7), BIC[1:7], marker='o')
plt.xlabel('Order of AR Model')
plt.ylabel('Bayesian Information Criterion')


For an AR(2), the BIC achieves its minimum at p=2, which is what we expect.