Autoregressive Moving Average (ARMA): Artificial data

In [1]:
%matplotlib inline
In [2]:
from __future__ import print_function
import numpy as np
import statsmodels.api as sm
import pandas as pd
from statsmodels.tsa.arima_process import arma_generate_sample
np.random.seed(12345)

Generate some data from an ARMA process:

In [3]:
arparams = np.array([.75, -.25])
maparams = np.array([.65, .35])

The conventions of the arma_generate function require that we specify a 1 for the zero-lag of the AR and MA parameters and that the AR parameters be negated.

In [4]:
arparams = np.r_[1, -arparams]
maparams = np.r_[1, maparams]
nobs = 250
y = arma_generate_sample(arparams, maparams, nobs)

Now, optionally, we can add some dates information. For this example, we'll use a pandas time series.

In [5]:
dates = sm.tsa.datetools.dates_from_range('1980m1', length=nobs)
y = pd.Series(y, index=dates)
arma_mod = sm.tsa.ARMA(y, order=(2,2))
arma_res = arma_mod.fit(trend='nc', disp=-1)
/home/travis/build/statsmodels/statsmodels/statsmodels/tsa/base/tsa_model.py:165: ValueWarning: No frequency information was provided, so inferred frequency M will be used.
  % freq, ValueWarning)
In [6]:
print(arma_res.summary())
                              ARMA Model Results                              
==============================================================================
Dep. Variable:                      y   No. Observations:                  250
Model:                     ARMA(2, 2)   Log Likelihood                -353.445
Method:                       css-mle   S.D. of innovations              0.990
Date:                Sun, 24 Nov 2019   AIC                            716.891
Time:                        07:49:41   BIC                            734.498
Sample:                    01-31-1980   HQIC                           723.977
                         - 10-31-2000                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ar.L1.y        0.7904      0.134      5.878      0.000       0.527       1.054
ar.L2.y       -0.2314      0.113     -2.044      0.042      -0.453      -0.009
ma.L1.y        0.7007      0.127      5.525      0.000       0.452       0.949
ma.L2.y        0.4061      0.095      4.291      0.000       0.221       0.592
                                    Roots                                    
=============================================================================
                  Real          Imaginary           Modulus         Frequency
-----------------------------------------------------------------------------
AR.1            1.7079           -1.1851j            2.0788           -0.0965
AR.2            1.7079           +1.1851j            2.0788            0.0965
MA.1           -0.8628           -1.3108j            1.5693           -0.3427
MA.2           -0.8628           +1.3108j            1.5693            0.3427
-----------------------------------------------------------------------------
In [7]:
y.tail()
Out[7]:
2000-06-30    0.173211
2000-07-31   -0.048325
2000-08-31   -0.415804
2000-09-30    0.338725
2000-10-31    0.360838
dtype: float64
In [8]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10,8))
fig = arma_res.plot_predict(start='1999-06-30', end='2001-05-31', ax=ax)
legend = ax.legend(loc='upper left')