{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "265cda99",
   "metadata": {},
   "source": [
    "\n",
    "<a id='lssm'></a>\n",
    "<div id=\"qe-notebook-header\" align=\"right\" style=\"text-align:right;\">\n",
    "        <a href=\"https://quantecon.org/\" title=\"quantecon.org\">\n",
    "                <img style=\"width:250px;display:inline;\" width=\"250px\" src=\"https://assets.quantecon.org/img/qe-menubar-logo.svg\" alt=\"QuantEcon\">\n",
    "        </a>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "55ddbf53",
   "metadata": {},
   "source": [
    "# Linear State Space Models\n",
    "\n",
    "\n",
    "<a id='index-0'></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1895c876",
   "metadata": {},
   "source": [
    "## Contents\n",
    "\n",
    "- [Linear State Space Models](#Linear-State-Space-Models)  \n",
    "  - [Overview](#Overview)  \n",
    "  - [The Linear State Space Model](#The-Linear-State-Space-Model)  \n",
    "  - [Distributions and Moments](#Distributions-and-Moments)  \n",
    "  - [Stationarity and Ergodicity](#Stationarity-and-Ergodicity)  \n",
    "  - [Noisy Observations](#Noisy-Observations)  \n",
    "  - [Prediction](#Prediction)  \n",
    "  - [Code](#Code)  \n",
    "  - [Exercises](#Exercises)  "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "090a90c7",
   "metadata": {},
   "source": [
    "> “We may regard the present state of the universe as the effect of its past and the cause of its future” – Marquis de Laplace\n",
    "\n",
    "\n",
    "In addition to what’s in Anaconda, this lecture will need the following libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "49b0013b",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "!pip install quantecon"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1cd60837",
   "metadata": {},
   "source": [
    "## Overview\n",
    "\n",
    "This lecture introduces the **linear state space** dynamic system.\n",
    "\n",
    "The linear state space system is a generalization of the scalar AR(1) process [we studied before](https://intro.quantecon.org/ar1_processes.html).\n",
    "\n",
    "This model is a workhorse that carries a powerful theory of prediction.\n",
    "\n",
    "Its many applications include:\n",
    "\n",
    "- representing dynamics of higher-order linear systems  \n",
    "- predicting the position of a system $ j $ steps into the future  \n",
    "- predicting a geometric sum of future values of a variable like  \n",
    "  - non-financial income  \n",
    "  - dividends on a stock  \n",
    "  - the money supply  \n",
    "  - a government deficit or surplus, etc.  \n",
    "- key ingredient of useful models  \n",
    "  - Friedman’s permanent income model of consumption smoothing.  \n",
    "  - Barro’s model of smoothing total tax collections.  \n",
    "  - Rational expectations version of Cagan’s model of hyperinflation.  \n",
    "  - Sargent and Wallace’s “unpleasant monetarist arithmetic,” etc.  \n",
    "\n",
    "\n",
    "Let’s start with some imports:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fc05225d",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "from quantecon import LinearStateSpace\n",
    "from scipy.stats import norm\n",
    "import random"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2a612ebb",
   "metadata": {},
   "source": [
    "## The Linear State Space Model\n",
    "\n",
    "\n",
    "<a id='index-1'></a>\n",
    "The objects in play are:\n",
    "\n",
    "- An $ n \\times 1 $ vector $ x_t $ denoting the **state** at time $ t = 0, 1, 2, \\ldots $.  \n",
    "- An IID sequence of $ m \\times 1 $ random vectors $ w_t \\sim N(0,I) $.  \n",
    "- A $ k \\times 1 $ vector $ y_t $ of **observations** at time $ t = 0, 1, 2, \\ldots $.  \n",
    "- An $ n \\times n $ matrix $ A $  called the **transition matrix**.  \n",
    "- An $ n \\times m $ matrix $ C $  called the **volatility matrix**.  \n",
    "- A $ k \\times n $ matrix $ G $ sometimes called the **output matrix**.  \n",
    "\n",
    "\n",
    "Here is the linear state-space system\n",
    "\n",
    "\n",
    "<a id='equation-st-space-rep'></a>\n",
    "$$\n",
    "\\begin{aligned}\n",
    "    x_{t+1} & =  A x_t + C w_{t+1}   \\\\\n",
    "    y_t &  =  G x_t \\nonumber \\\\\n",
    "    x_0 & \\sim N(\\mu_0, \\Sigma_0) \\nonumber\n",
    "\\end{aligned} \\tag{21.1}\n",
    "$$\n",
    "\n",
    "\n",
    "<a id='lss-pgs'></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f40736a0",
   "metadata": {},
   "source": [
    "### Primitives\n",
    "\n",
    "The primitives of the model are\n",
    "\n",
    "1. the matrices $ A, C, G $  \n",
    "1. shock distribution, which we have specialized to $ N(0,I) $  \n",
    "1. the distribution of the initial condition $ x_0 $, which we have set to $ N(\\mu_0, \\Sigma_0) $  \n",
    "\n",
    "\n",
    "Given $ A, C, G $ and draws of $ x_0 $ and $ w_1, w_2, \\ldots $, the\n",
    "model [(21.1)](#equation-st-space-rep) pins down the values of the sequences $ \\{x_t\\} $ and $ \\{y_t\\} $.\n",
    "\n",
    "Even without these draws, the primitives 1–3 pin down the *probability distributions* of $ \\{x_t\\} $ and $ \\{y_t\\} $.\n",
    "\n",
    "Later we’ll see how to compute these distributions and their moments."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bfa79c83",
   "metadata": {},
   "source": [
    "#### Martingale Difference Shocks\n",
    "\n",
    "\n",
    "<a id='index-2'></a>\n",
    "We’ve made the common assumption that the shocks are independent standardized normal vectors.\n",
    "\n",
    "But some of what we say will be valid under the assumption that $ \\{w_{t+1}\\} $ is a **martingale difference sequence**.\n",
    "\n",
    "A martingale difference sequence is a sequence that is zero mean when conditioned on past information.\n",
    "\n",
    "In the present case, since $ \\{x_t\\} $ is our state sequence, this means that it satisfies\n",
    "\n",
    "$$\n",
    "\\mathbb{E} [w_{t+1} | x_t, x_{t-1}, \\ldots ] = 0\n",
    "$$\n",
    "\n",
    "This is a weaker condition than that $ \\{w_t\\} $ is IID with $ w_{t+1} \\sim N(0,I) $."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e1f7bf8",
   "metadata": {},
   "source": [
    "### Examples\n",
    "\n",
    "By appropriate choice of the primitives, a variety of dynamics can be represented in terms of the linear state space model.\n",
    "\n",
    "The following examples help to highlight this point.\n",
    "\n",
    "They also illustrate the wise dictum *finding the state is an art*.\n",
    "\n",
    "\n",
    "<a id='lss-sode'></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "551b8938",
   "metadata": {},
   "source": [
    "#### Second-order Difference Equation\n",
    "\n",
    "Let $ \\{y_t\\} $ be a deterministic sequence that satisfies\n",
    "\n",
    "\n",
    "<a id='equation-st-ex-1'></a>\n",
    "$$\n",
    "y_{t+1} =  \\phi_0 + \\phi_1 y_t + \\phi_2 y_{t-1}\n",
    "\\quad \\text{s.t.} \\quad\n",
    "y_0, y_{-1} \\text{ given} \\tag{21.2}\n",
    "$$\n",
    "\n",
    "To map [(21.2)](#equation-st-ex-1) into our state space system [(21.1)](#equation-st-space-rep), we set\n",
    "\n",
    "$$\n",
    "x_t=\n",
    "\\begin{bmatrix}\n",
    "    1 \\\\\n",
    "    y_t \\\\\n",
    "    y_{t-1}\n",
    "\\end{bmatrix}\n",
    "\\qquad\n",
    "A = \\begin{bmatrix}\n",
    "          1 & 0 & 0 \\\\\n",
    "          \\phi_0 & \\phi_1 & \\phi_2  \\\\\n",
    "          0 & 1 & 0\n",
    "    \\end{bmatrix}\n",
    "\\qquad\n",
    "C= \\begin{bmatrix}\n",
    "    0 \\\\\n",
    "    0 \\\\\n",
    "    0\n",
    "    \\end{bmatrix}\n",
    "\\qquad\n",
    "G = \\begin{bmatrix} 0 & 1 & 0 \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "You can confirm that under these definitions, [(21.1)](#equation-st-space-rep) and [(21.2)](#equation-st-ex-1) agree.\n",
    "\n",
    "The next figure shows the dynamics of this process when $ \\phi_0 = 1.1, \\phi_1=0.8, \\phi_2 = -0.8, y_0 = y_{-1} = 1 $.\n",
    "\n",
    "\n",
    "<a id='lss-sode-fig'></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f7641ec2",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def plot_lss(A,\n",
    "         C,\n",
    "         G,\n",
    "         n=3,\n",
    "         ts_length=50):\n",
    "\n",
    "    ar = LinearStateSpace(A, C, G, mu_0=np.ones(n))\n",
    "    x, y = ar.simulate(ts_length)\n",
    "\n",
    "    fig, ax = plt.subplots()\n",
    "    y = y.flatten()\n",
    "    ax.plot(y, 'b-', lw=2, alpha=0.7)\n",
    "    ax.grid()\n",
    "    ax.set_xlabel('time', fontsize=12)\n",
    "    ax.set_ylabel('$y_t$', fontsize=12)\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e5549f0f",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "ϕ_0, ϕ_1, ϕ_2 = 1.1, 0.8, -0.8\n",
    "\n",
    "A = [[1,     0,     0  ],\n",
    "     [ϕ_0,   ϕ_1,   ϕ_2],\n",
    "     [0,     1,     0  ]]\n",
    "\n",
    "C = np.zeros((3, 1))\n",
    "G = [0, 1, 0]\n",
    "\n",
    "plot_lss(A, C, G)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "386113d9",
   "metadata": {},
   "source": [
    "Later you’ll be asked to recreate this figure."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d92ab2a0",
   "metadata": {},
   "source": [
    "#### Univariate Autoregressive Processes\n",
    "\n",
    "\n",
    "<a id='index-3'></a>\n",
    "We can use [(21.1)](#equation-st-space-rep) to represent the model\n",
    "\n",
    "\n",
    "<a id='equation-eq-ar-rep'></a>\n",
    "$$\n",
    "y_{t+1} = \\phi_1 y_{t} + \\phi_2 y_{t-1} + \\phi_3 y_{t-2} + \\phi_4  y_{t-3} + \\sigma w_{t+1} \\tag{21.3}\n",
    "$$\n",
    "\n",
    "where $ \\{w_t\\} $ is IID and standard normal.\n",
    "\n",
    "To put this in the linear state space format we take $ x_t = \\begin{bmatrix} y_t & y_{t-1} &  y_{t-2} &  y_{t-3} \\end{bmatrix}' $ and\n",
    "\n",
    "$$\n",
    "A =\n",
    "\\begin{bmatrix}\n",
    "    \\phi_1 & \\phi_2 & \\phi_3 & \\phi_4 \\\\\n",
    "    1 & 0 & 0 & 0 \\\\\n",
    "    0 & 1 & 0 & 0 \\\\\n",
    "    0 & 0 & 1 & 0\n",
    "\\end{bmatrix}\n",
    "\\qquad\n",
    "C = \\begin{bmatrix}\n",
    "        \\sigma \\\\\n",
    "        0 \\\\\n",
    "        0 \\\\\n",
    "        0\n",
    "    \\end{bmatrix}\n",
    "\\qquad\n",
    " G = \\begin{bmatrix}\n",
    "         1 & 0  & 0 & 0\n",
    "     \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "The matrix $ A $ has the form of the *companion matrix* to the vector\n",
    "$ \\begin{bmatrix}\\phi_1 &  \\phi_2 & \\phi_3 & \\phi_4 \\end{bmatrix} $.\n",
    "\n",
    "The next figure shows the dynamics of this process when\n",
    "\n",
    "$$\n",
    "\\phi_1 = 0.5, \\phi_2 = -0.2, \\phi_3 = 0, \\phi_4 = 0.5, \\sigma = 0.2, y_0 = y_{-1} = y_{-2} =\n",
    "y_{-3} = 1\n",
    "$$\n",
    "\n",
    "\n",
    "<a id='lss-uap-fig'></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "977b4f4f",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "ϕ_1, ϕ_2, ϕ_3, ϕ_4 = 0.5, -0.2, 0, 0.5\n",
    "σ = 0.2\n",
    "\n",
    "A_1 = [[ϕ_1,   ϕ_2,   ϕ_3,   ϕ_4],\n",
    "       [1,     0,     0,     0  ],\n",
    "       [0,     1,     0,     0  ],\n",
    "       [0,     0,     1,     0  ]]\n",
    "\n",
    "C_1 = [[σ],\n",
    "       [0],\n",
    "       [0],\n",
    "       [0]]\n",
    "\n",
    "G_1 = [1, 0, 0, 0]\n",
    "\n",
    "plot_lss(A_1, C_1, G_1, n=4, ts_length=200)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a376cfd0",
   "metadata": {},
   "source": [
    "#### Vector Autoregressions\n",
    "\n",
    "\n",
    "<a id='index-4'></a>\n",
    "Now suppose that\n",
    "\n",
    "- $ y_t $ is a $ k \\times 1 $ vector  \n",
    "- $ \\phi_j $ is a $ k \\times k $ matrix and  \n",
    "- $ w_t $ is $ k \\times 1 $  \n",
    "\n",
    "\n",
    "Then [(21.3)](#equation-eq-ar-rep) is termed a *vector autoregression*.\n",
    "\n",
    "To map this into [(21.1)](#equation-st-space-rep), we set\n",
    "\n",
    "$$\n",
    "x_t =\n",
    "\\begin{bmatrix}\n",
    "     y_t \\\\\n",
    "     y_{t-1} \\\\\n",
    "     y_{t-2} \\\\\n",
    "     y_{t-3}\n",
    " \\end{bmatrix}\n",
    "\\quad\n",
    "A =\n",
    "\\begin{bmatrix}\n",
    "\\phi_1 & \\phi_2 & \\phi_3 & \\phi_4 \\\\\n",
    "I & 0        & 0        & 0       \\\\\n",
    "0 & I        & 0        & 0       \\\\\n",
    "0 & 0        & I        & 0\n",
    "\\end{bmatrix}\n",
    "\\quad\n",
    "C =\n",
    "\\begin{bmatrix}\n",
    "     \\sigma \\\\\n",
    "     0 \\\\\n",
    "     0 \\\\\n",
    "     0\n",
    " \\end{bmatrix}\n",
    "\\quad\n",
    "G =\n",
    "\\begin{bmatrix}\n",
    "     I & 0 & 0 & 0\n",
    " \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "where $ I $ is the $ k \\times k $ identity matrix and $ \\sigma $ is a $ k \\times k $ matrix."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d7cd766",
   "metadata": {},
   "source": [
    "#### Seasonals\n",
    "\n",
    "\n",
    "<a id='index-5'></a>\n",
    "We can use [(21.1)](#equation-st-space-rep) to represent\n",
    "\n",
    "1. the *deterministic seasonal* $ y_t = y_{t-4} $  \n",
    "1. the *indeterministic seasonal* $ y_t = \\phi_4 y_{t-4} + w_t $  \n",
    "\n",
    "\n",
    "In fact, both are special cases of [(21.3)](#equation-eq-ar-rep).\n",
    "\n",
    "With the deterministic seasonal, the transition matrix becomes\n",
    "\n",
    "$$\n",
    "A = \\begin{bmatrix}\n",
    "        0 & 0 & 0 & 1 \\\\\n",
    "        1 & 0 & 0 & 0 \\\\\n",
    "        0 & 1 & 0 & 0 \\\\\n",
    "        0 & 0 & 1 & 0\n",
    "    \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "It is easy to check that $ A^4 = I $, which implies that $ x_t $ is strictly periodic with period 4:<sup><a href=#foot1 id=foot1-link>[1]</a></sup>\n",
    "\n",
    "$$\n",
    "x_{t+4} = x_t\n",
    "$$\n",
    "\n",
    "Such an $ x_t $ process can be used to model deterministic seasonals in quarterly time series.\n",
    "\n",
    "The *indeterministic* seasonal produces recurrent, but aperiodic, seasonal fluctuations."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9b57ba80",
   "metadata": {},
   "source": [
    "#### Time Trends\n",
    "\n",
    "\n",
    "<a id='index-6'></a>\n",
    "The model $ y_t = a t + b $ is known as a *linear time trend*.\n",
    "\n",
    "We can represent this model in the linear state space form by taking\n",
    "\n",
    "\n",
    "<a id='equation-lss-ltt'></a>\n",
    "$$\n",
    "A\n",
    "= \\begin{bmatrix}\n",
    "    1 & 1  \\\\\n",
    "    0 & 1\n",
    "  \\end{bmatrix}\n",
    "\\qquad\n",
    "C\n",
    "= \\begin{bmatrix}\n",
    "        0 \\\\\n",
    "        0\n",
    "  \\end{bmatrix}\n",
    "\\qquad\n",
    "G\n",
    "= \\begin{bmatrix}\n",
    "        a & b\n",
    "  \\end{bmatrix} \\tag{21.4}\n",
    "$$\n",
    "\n",
    "and starting at initial condition $ x_0 = \\begin{bmatrix} 0 & 1\\end{bmatrix}' $.\n",
    "\n",
    "In fact, it’s possible to use the state-space system to represent polynomial trends of any order.\n",
    "\n",
    "For instance, we can represent the model $ y_t = a t^2 + bt + c $ in the linear state space form by taking\n",
    "\n",
    "$$\n",
    "A\n",
    "= \\begin{bmatrix}\n",
    "    1 & 1 & 0 \\\\\n",
    "    0 & 1 & 1 \\\\\n",
    "    0 & 0 & 1\n",
    "  \\end{bmatrix}\n",
    "\\qquad\n",
    "C\n",
    "= \\begin{bmatrix}\n",
    "        0 \\\\\n",
    "        0 \\\\\n",
    "        0\n",
    "  \\end{bmatrix}\n",
    "\\qquad\n",
    "G\n",
    "= \\begin{bmatrix}\n",
    "        2a & a + b & c\n",
    "  \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "and starting at initial condition $ x_0 = \\begin{bmatrix} 0 & 0 & 1 \\end{bmatrix}' $.\n",
    "\n",
    "It follows that\n",
    "\n",
    "$$\n",
    "A^t =\n",
    "\\begin{bmatrix}\n",
    " 1 & t & t(t-1)/2 \\\\\n",
    " 0 & 1 & t \\\\\n",
    " 0 & 0 & 1\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "Then $ x_t^\\prime = \\begin{bmatrix} t(t-1)/2 &t & 1 \\end{bmatrix} $. You can now confirm that $ y_t = G x_t $ has the correct form."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94693a8d",
   "metadata": {},
   "source": [
    "### Moving Average Representations\n",
    "\n",
    "\n",
    "<a id='index-7'></a>\n",
    "A nonrecursive expression for $ x_t $ as a function of\n",
    "$ x_0, w_1, w_2, \\ldots,  w_t $ can be found by using [(21.1)](#equation-st-space-rep) repeatedly to obtain\n",
    "\n",
    "\n",
    "<a id='equation-eqob5'></a>\n",
    "$$\n",
    "\\begin{aligned}\n",
    "   x_t & = Ax_{t-1} + Cw_t \\\\\n",
    "       & = A^2 x_{t-2} + ACw_{t-1} + Cw_t \\nonumber \\\\\n",
    "       & \\qquad \\vdots \\nonumber \\\\\n",
    "       & = \\sum_{j=0}^{t-1} A^j Cw_{t-j}  + A^t x_0 \\nonumber\n",
    "\\end{aligned} \\tag{21.5}\n",
    "$$\n",
    "\n",
    "Representation [(21.5)](#equation-eqob5) is a  *moving average* representation.\n",
    "\n",
    "It expresses $ \\{x_t\\} $ as a linear function of\n",
    "\n",
    "1. current and past values of the  process $ \\{w_t\\} $ and  \n",
    "1. the initial condition $ x_0 $  \n",
    "\n",
    "\n",
    "As an example of a moving average representation, let the model be\n",
    "\n",
    "$$\n",
    "A\n",
    "= \\begin{bmatrix}\n",
    "    1 & 1  \\\\\n",
    "    0 & 1\n",
    "  \\end{bmatrix}\n",
    "\\qquad\n",
    "C\n",
    "= \\begin{bmatrix}\n",
    "        1 \\\\\n",
    "        0\n",
    "  \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "You will be able to show that $ A^t = \\begin{bmatrix} 1 & t \\cr 0 & 1 \\end{bmatrix} $ and  $ A^j C = \\begin{bmatrix} 1 & 0 \\end{bmatrix}' $.\n",
    "\n",
    "Substituting into the moving average representation [(21.5)](#equation-eqob5), we obtain\n",
    "\n",
    "$$\n",
    "x_{1t} = \\sum_{j=0}^{t-1} w_{t-j} +\n",
    "\\begin{bmatrix}\n",
    " 1 & t\n",
    "\\end{bmatrix}\n",
    "x_0\n",
    "$$\n",
    "\n",
    "where $ x_{1t} $ is the first entry of $ x_t $.\n",
    "\n",
    "The first term on the right is a cumulated sum of martingale differences and is therefore a [martingale](https://en.wikipedia.org/wiki/Martingale_%28probability_theory%29).\n",
    "\n",
    "The second term is a translated linear function of time.\n",
    "\n",
    "For this reason, $ x_{1t} $ is called a *martingale with drift*."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c3290ce3",
   "metadata": {},
   "source": [
    "## Distributions and Moments\n",
    "\n",
    "\n",
    "<a id='index-9'></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2c50388b",
   "metadata": {},
   "source": [
    "### Unconditional Moments\n",
    "\n",
    "Using [(21.1)](#equation-st-space-rep), it’s easy to obtain expressions for the\n",
    "(unconditional) means of $ x_t $ and $ y_t $.\n",
    "\n",
    "We’ll explain what *unconditional* and *conditional* mean soon.\n",
    "\n",
    "Letting $ \\mu_t := \\mathbb{E} [x_t] $ and using linearity of expectations, we\n",
    "find that\n",
    "\n",
    "\n",
    "<a id='equation-lss-mut-linear-models'></a>\n",
    "$$\n",
    "\\mu_{t+1} = A \\mu_t\n",
    "\\quad \\text{with} \\quad \\mu_0 \\text{ given} \\tag{21.6}\n",
    "$$\n",
    "\n",
    "Here $ \\mu_0 $ is a primitive given in [(21.1)](#equation-st-space-rep).\n",
    "\n",
    "The variance-covariance matrix of $ x_t $ is $ \\Sigma_t := \\mathbb{E} [ (x_t - \\mu_t) (x_t - \\mu_t)'] $.\n",
    "\n",
    "Using $ x_{t+1} - \\mu_{t+1} = A (x_t - \\mu_t) + C w_{t+1} $, we can\n",
    "determine this matrix recursively via\n",
    "\n",
    "\n",
    "<a id='equation-eqsigmalaw-linear-models'></a>\n",
    "$$\n",
    "\\Sigma_{t+1}  = A \\Sigma_t A' + C C'\n",
    "\\quad \\text{with} \\quad \\Sigma_0 \\text{ given} \\tag{21.7}\n",
    "$$\n",
    "\n",
    "As with $ \\mu_0 $, the matrix $ \\Sigma_0 $ is a primitive given in [(21.1)](#equation-st-space-rep).\n",
    "\n",
    "As a matter of terminology, we will sometimes call\n",
    "\n",
    "- $ \\mu_t $ the *unconditional mean*  of $ x_t $  \n",
    "- $ \\Sigma_t $ the *unconditional variance-covariance matrix*  of $ x_t $  \n",
    "\n",
    "\n",
    "This is to distinguish $ \\mu_t $ and $ \\Sigma_t $ from related objects that use conditioning\n",
    "information, to be defined below.\n",
    "\n",
    "However, you should be aware that these “unconditional” moments do depend on\n",
    "the initial distribution $ N(\\mu_0, \\Sigma_0) $."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14acbd9a",
   "metadata": {},
   "source": [
    "#### Moments of the Observables\n",
    "\n",
    "Using linearity of expectations again we have\n",
    "\n",
    "\n",
    "<a id='equation-lss-umy'></a>\n",
    "$$\n",
    "\\mathbb{E} [y_t] = \\mathbb{E} [G x_t] = G \\mu_t \\tag{21.8}\n",
    "$$\n",
    "\n",
    "The variance-covariance matrix of $ y_t $ is easily shown to be\n",
    "\n",
    "\n",
    "<a id='equation-lss-uvy'></a>\n",
    "$$\n",
    "\\textrm{Var} [y_t] = \\textrm{Var} [G x_t] = G \\Sigma_t G' \\tag{21.9}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b415e515",
   "metadata": {},
   "source": [
    "### Distributions\n",
    "\n",
    "\n",
    "<a id='index-10'></a>\n",
    "In general, knowing the mean and variance-covariance matrix of a random vector\n",
    "is not quite as good as knowing the full distribution.\n",
    "\n",
    "However, there are some situations where these moments alone tell us all we\n",
    "need to know.\n",
    "\n",
    "These are situations in which the mean vector and covariance matrix are all of the  **parameters** that pin down the population distribution.\n",
    "\n",
    "One such situation is when the vector in question is Gaussian (i.e., normally\n",
    "distributed).\n",
    "\n",
    "This is the case here, given\n",
    "\n",
    "1. our Gaussian assumptions on the primitives  \n",
    "1. the fact that normality is preserved under linear operations  \n",
    "\n",
    "\n",
    "In fact, it’s [well-known](https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Affine_transformation) that\n",
    "\n",
    "\n",
    "<a id='equation-lss-glig'></a>\n",
    "$$\n",
    "u \\sim N(\\bar u, S)\n",
    "\\quad \\text{and} \\quad\n",
    "v = a + B u\n",
    "\\implies\n",
    "v \\sim N(a + B \\bar u, B S B') \\tag{21.10}\n",
    "$$\n",
    "\n",
    "In particular, given our Gaussian assumptions on the primitives and the\n",
    "linearity of [(21.1)](#equation-st-space-rep) we can see immediately that  both $ x_t $ and\n",
    "$ y_t $ are  Gaussian for all $ t \\geq 0 $ <sup><a href=#fn-ag id=fn-ag-link>[2]</a></sup>.\n",
    "\n",
    "Since $ x_t $ is Gaussian, to find the distribution, all we need to do is\n",
    "find its mean and variance-covariance matrix.\n",
    "\n",
    "But in fact we’ve already done this, in [(21.6)](#equation-lss-mut-linear-models) and [(21.7)](#equation-eqsigmalaw-linear-models).\n",
    "\n",
    "Letting $ \\mu_t $ and $ \\Sigma_t $ be as defined by these equations,\n",
    "we have\n",
    "\n",
    "\n",
    "<a id='equation-lss-mgs-x'></a>\n",
    "$$\n",
    "x_t \\sim N(\\mu_t, \\Sigma_t) \\tag{21.11}\n",
    "$$\n",
    "\n",
    "By similar reasoning combined with [(21.8)](#equation-lss-umy) and [(21.9)](#equation-lss-uvy),\n",
    "\n",
    "\n",
    "<a id='equation-lss-mgs-y'></a>\n",
    "$$\n",
    "y_t \\sim N(G \\mu_t, G \\Sigma_t G') \\tag{21.12}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "71b735a6",
   "metadata": {},
   "source": [
    "### Ensemble Interpretations\n",
    "\n",
    "How should we interpret the distributions defined by [(21.11)](#equation-lss-mgs-x)–[(21.12)](#equation-lss-mgs-y)?\n",
    "\n",
    "Intuitively, the probabilities in a distribution correspond to relative frequencies in a large population drawn from that distribution.\n",
    "\n",
    "Let’s apply this idea to our setting, focusing on the distribution of $ y_T $ for fixed $ T $.\n",
    "\n",
    "We can generate independent draws of $ y_T $ by repeatedly simulating the\n",
    "evolution of the system up to time $ T $, using an independent set of\n",
    "shocks each time.\n",
    "\n",
    "The next figure shows 20 simulations, producing 20 time series for $ \\{y_t\\} $, and hence 20 draws of $ y_T $.\n",
    "\n",
    "The system in question is the univariate autoregressive model [(21.3)](#equation-eq-ar-rep).\n",
    "\n",
    "The values of $ y_T $ are represented by black dots in the left-hand figure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4ffd6f57",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def cross_section_plot(A,\n",
    "                   C,\n",
    "                   G,\n",
    "                   T=20,                 # Set the time\n",
    "                   ymin=-0.8,\n",
    "                   ymax=1.25,\n",
    "                   sample_size = 20,     # 20 observations/simulations\n",
    "                   n=4):                 # The number of dimensions for the initial x0\n",
    "\n",
    "    ar = LinearStateSpace(A, C, G, mu_0=np.ones(n))\n",
    "\n",
    "    fig, axes = plt.subplots(1, 2, figsize=(16, 5))\n",
    "\n",
    "    for ax in axes:\n",
    "        ax.grid(alpha=0.4)\n",
    "        ax.set_ylim(ymin, ymax)\n",
    "\n",
    "    ax = axes[0]\n",
    "    ax.set_ylim(ymin, ymax)\n",
    "    ax.set_ylabel('$y_t$', fontsize=12)\n",
    "    ax.set_xlabel('time', fontsize=12)\n",
    "    ax.vlines((T,), -1.5, 1.5)\n",
    "\n",
    "    ax.set_xticks((T,))\n",
    "    ax.set_xticklabels(('$T$',))\n",
    "\n",
    "    sample = []\n",
    "    for i in range(sample_size):\n",
    "        rcolor = random.choice(('c', 'g', 'b', 'k'))\n",
    "        x, y = ar.simulate(ts_length=T+15)\n",
    "        y = y.flatten()\n",
    "        ax.plot(y, color=rcolor, lw=1, alpha=0.5)\n",
    "        ax.plot((T,), (y[T],), 'ko', alpha=0.5)\n",
    "        sample.append(y[T])\n",
    "\n",
    "    y = y.flatten()\n",
    "    axes[1].set_ylim(ymin, ymax)\n",
    "    axes[1].set_ylabel('$y_t$', fontsize=12)\n",
    "    axes[1].set_xlabel('relative frequency', fontsize=12)\n",
    "    axes[1].hist(sample, bins=16, density=True, orientation='horizontal', alpha=0.5)\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "be101e65",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "ϕ_1, ϕ_2, ϕ_3, ϕ_4 = 0.5, -0.2, 0, 0.5\n",
    "σ = 0.1\n",
    "\n",
    "A_2 = [[ϕ_1, ϕ_2, ϕ_3, ϕ_4],\n",
    "       [1,     0,     0,     0],\n",
    "       [0,     1,     0,     0],\n",
    "       [0,     0,     1,     0]]\n",
    "\n",
    "C_2 = [[σ], [0], [0], [0]]\n",
    "\n",
    "G_2 = [1, 0, 0, 0]\n",
    "\n",
    "cross_section_plot(A_2, C_2, G_2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14c22224",
   "metadata": {},
   "source": [
    "In the right-hand figure, these values are converted into a rotated histogram\n",
    "that shows relative frequencies from our sample of 20 $ y_T $’s.\n",
    "\n",
    "Here is another figure, this time with 100 observations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3d88319a",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "t = 100\n",
    "cross_section_plot(A_2, C_2, G_2, T=t)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f1877c8c",
   "metadata": {},
   "source": [
    "Let’s now try with 500,000 observations, showing only the histogram (without rotation)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6da7d0d8",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "T = 100\n",
    "ymin=-0.8\n",
    "ymax=1.25\n",
    "sample_size = 500_000\n",
    "\n",
    "ar = LinearStateSpace(A_2, C_2, G_2, mu_0=np.ones(4))\n",
    "fig, ax = plt.subplots()\n",
    "x, y = ar.simulate(sample_size)\n",
    "mu_x, mu_y, Sigma_x, Sigma_y, Sigma_yx = ar.stationary_distributions()\n",
    "f_y = norm(loc=float(mu_y.item()), scale=float(np.sqrt(Sigma_y.item())))\n",
    "y = y.flatten()\n",
    "ygrid = np.linspace(ymin, ymax, 150)\n",
    "\n",
    "ax.hist(y, bins=50, density=True, alpha=0.4)\n",
    "ax.plot(ygrid, f_y.pdf(ygrid), 'k-', lw=2, alpha=0.8, label='true density')\n",
    "ax.set_xlim(ymin, ymax)\n",
    "ax.set_xlabel('$y_t$', fontsize=12)\n",
    "ax.set_ylabel('relative frequency', fontsize=12)\n",
    "ax.legend(fontsize=12)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2f8db530",
   "metadata": {},
   "source": [
    "The black line is the population density of $ y_T $ calculated from [(21.12)](#equation-lss-mgs-y).\n",
    "\n",
    "The histogram and population distribution are close, as expected.\n",
    "\n",
    "By looking at the figures and experimenting with parameters, you will gain a\n",
    "feel for how the population distribution depends on the model primitives [listed above](#lss-pgs), as intermediated by the distribution’s parameters."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "937206e1",
   "metadata": {},
   "source": [
    "#### Ensemble Means\n",
    "\n",
    "In the preceding figure, we approximated the population distribution of $ y_T $ by\n",
    "\n",
    "1. generating $ I $ sample paths (i.e., time series) where $ I $ is a large number  \n",
    "1. recording each observation $ y^i_T $  \n",
    "1. histogramming this sample  \n",
    "\n",
    "\n",
    "Just as the histogram approximates the population distribution, the *ensemble* or\n",
    "*cross-sectional average*\n",
    "\n",
    "$$\n",
    "\\bar y_T := \\frac{1}{I} \\sum_{i=1}^I y_T^i\n",
    "$$\n",
    "\n",
    "approximates the expectation $ \\mathbb{E} [y_T] = G \\mu_T $ (as implied by the law of large numbers).\n",
    "\n",
    "Here’s a simulation comparing the ensemble averages and population means at time points $ t=0,\\ldots,50 $.\n",
    "\n",
    "The parameters are the same as for the preceding figures,\n",
    "and the sample size is relatively small ($ I=20 $).\n",
    "\n",
    "\n",
    "<a id='lss-em-fig'></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4534ec2b",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "I = 20\n",
    "T = 50\n",
    "ymin = -0.5\n",
    "ymax = 1.15\n",
    "\n",
    "ar = LinearStateSpace(A_2, C_2, G_2, mu_0=np.ones(4))\n",
    "\n",
    "fig, ax = plt.subplots()\n",
    "\n",
    "ensemble_mean = np.zeros(T)\n",
    "for i in range(I):\n",
    "    x, y = ar.simulate(ts_length=T)\n",
    "    y = y.flatten()\n",
    "    ax.plot(y, 'c-', lw=0.8, alpha=0.5)\n",
    "    ensemble_mean = ensemble_mean + y\n",
    "\n",
    "ensemble_mean = ensemble_mean / I\n",
    "ax.plot(ensemble_mean, color='b', lw=2, alpha=0.8, label='$\\\\bar y_t$')\n",
    "m = ar.moment_sequence()\n",
    "\n",
    "population_means = []\n",
    "for t in range(T):\n",
    "    μ_x, μ_y, Σ_x, Σ_y = next(m)\n",
    "    population_means.append(float(μ_y.item()))\n",
    "\n",
    "ax.plot(population_means, color='g', lw=2, alpha=0.8, label=r'$G\\mu_t$')\n",
    "ax.set_ylim(ymin, ymax)\n",
    "ax.set_xlabel('time', fontsize=12)\n",
    "ax.set_ylabel('$y_t$', fontsize=12)\n",
    "ax.legend(ncol=2)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "407997c6",
   "metadata": {},
   "source": [
    "The ensemble mean for $ x_t $ is\n",
    "\n",
    "$$\n",
    "\\bar x_T := \\frac{1}{I} \\sum_{i=1}^I x_T^i \\to \\mu_T\n",
    "\\qquad (I \\to \\infty)\n",
    "$$\n",
    "\n",
    "The limit $ \\mu_T $ is a  “long-run average”.\n",
    "\n",
    "(By *long-run average* we mean the average for an infinite ($ I = \\infty $)  number of sample $ x_T $’s)\n",
    "\n",
    "Another application of the law of large numbers assures us that\n",
    "\n",
    "$$\n",
    "\\frac{1}{I} \\sum_{i=1}^I (x_T^i - \\bar x_T) (x_T^i - \\bar x_T)' \\to \\Sigma_T\n",
    "\\qquad (I \\to \\infty)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "270d99af",
   "metadata": {},
   "source": [
    "### Joint Distributions\n",
    "\n",
    "In the preceding discussion, we looked at the distributions of $ x_t $ and\n",
    "$ y_t $ in isolation.\n",
    "\n",
    "This gives us useful information but doesn’t allow us to answer questions like\n",
    "\n",
    "- what’s the probability that $ x_t \\geq 0 $ for all $ t $?  \n",
    "- what’s the probability that the process $ \\{y_t\\} $ exceeds some value $ a $ before falling below $ b $?  \n",
    "- etc., etc.  \n",
    "\n",
    "\n",
    "Such questions concern the *joint distributions* of these sequences.\n",
    "\n",
    "To compute the joint distribution of $ x_0, x_1, \\ldots, x_T $, recall\n",
    "that joint and conditional densities are linked by the rule\n",
    "\n",
    "$$\n",
    "p(x, y) = p(y \\, | \\, x) p(x)\n",
    "\\qquad \\text{(joint }=\\text{ conditional }\\times\\text{ marginal)}\n",
    "$$\n",
    "\n",
    "From this rule we get $ p(x_0, x_1) = p(x_1 \\,|\\, x_0) p(x_0) $.\n",
    "\n",
    "The Markov property $ p(x_t \\,|\\, x_{t-1}, \\ldots, x_0) =  p(x_t \\,|\\, x_{t-1}) $ and repeated applications of the preceding rule lead us to\n",
    "\n",
    "$$\n",
    "p(x_0, x_1, \\ldots, x_T) =  p(x_0) \\prod_{t=0}^{T-1} p(x_{t+1} \\,|\\, x_t)\n",
    "$$\n",
    "\n",
    "The marginal $ p(x_0) $ is just the primitive $ N(\\mu_0, \\Sigma_0) $.\n",
    "\n",
    "In view of [(21.1)](#equation-st-space-rep), the conditional densities are\n",
    "\n",
    "$$\n",
    "p(x_{t+1} \\,|\\, x_t) = N(Ax_t, C C')\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e67a2a9a",
   "metadata": {},
   "source": [
    "#### Autocovariance Functions\n",
    "\n",
    "An important object related to the joint distribution is the *autocovariance function*\n",
    "\n",
    "\n",
    "<a id='equation-eqnautodeff'></a>\n",
    "$$\n",
    "\\Sigma_{t+j, t} := \\mathbb{E} [ (x_{t+j} - \\mu_{t+j})(x_t - \\mu_t)' ] \\tag{21.13}\n",
    "$$\n",
    "\n",
    "Elementary calculations show that\n",
    "\n",
    "\n",
    "<a id='equation-eqnautocov'></a>\n",
    "$$\n",
    "\\Sigma_{t+j,t} = A^j \\Sigma_t \\tag{21.14}\n",
    "$$\n",
    "\n",
    "Notice that $ \\Sigma_{t+j,t} $ in general depends on both $ j $, the gap between the two dates, and $ t $, the earlier date."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e93f1702",
   "metadata": {},
   "source": [
    "## Stationarity and Ergodicity\n",
    "\n",
    "\n",
    "<a id='index-12'></a>\n",
    "Stationarity and ergodicity are two properties  that, when they hold,  greatly aid analysis of linear state space models.\n",
    "\n",
    "Let’s start with the intuition."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cf4d81d0",
   "metadata": {},
   "source": [
    "### Visualizing Stability\n",
    "\n",
    "Let’s look at some more time series from the same model that we analyzed above.\n",
    "\n",
    "This picture shows cross-sectional distributions for $ y $ at times\n",
    "$ T, T', T'' $"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ea42837d",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def cross_plot(A,\n",
    "            C,\n",
    "            G,\n",
    "            steady_state='False',\n",
    "            T0 = 10,\n",
    "            T1 = 50,\n",
    "            T2 = 75,\n",
    "            T4 = 100):\n",
    "\n",
    "    ar = LinearStateSpace(A, C, G, mu_0=np.ones(4))\n",
    "\n",
    "    if steady_state == 'True':\n",
    "        μ_x, μ_y, Σ_x, Σ_y, Σ_yx = ar.stationary_distributions()\n",
    "        ar_state = LinearStateSpace(A, C, G, mu_0=μ_x, Sigma_0=Σ_x)\n",
    "\n",
    "    ymin, ymax = -0.6, 0.6\n",
    "    fig, ax = plt.subplots()\n",
    "    ax.grid(alpha=0.4)\n",
    "    ax.set_ylim(ymin, ymax)\n",
    "    ax.set_ylabel('$y_t$', fontsize=12)\n",
    "    ax.set_xlabel('$time$', fontsize=12)\n",
    "\n",
    "    ax.vlines((T0, T1, T2), -1.5, 1.5)\n",
    "    ax.set_xticks((T0, T1, T2))\n",
    "    ax.set_xticklabels((\"$T$\", \"$T'$\", \"$T''$\"), fontsize=12)\n",
    "    for i in range(80):\n",
    "        rcolor = random.choice(('c', 'g', 'b'))\n",
    "\n",
    "        if steady_state == 'True':\n",
    "            x, y = ar_state.simulate(ts_length=T4)\n",
    "        else:\n",
    "            x, y = ar.simulate(ts_length=T4)\n",
    "\n",
    "        y = y.flatten()\n",
    "        ax.plot(y, color=rcolor, lw=0.8, alpha=0.5)\n",
    "        ax.plot((T0, T1, T2), (y[T0], y[T1], y[T2],), 'ko', alpha=0.5)\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1c226275",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "cross_plot(A_2, C_2, G_2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5af58ffd",
   "metadata": {},
   "source": [
    "Note how the time series “settle down” in the sense that the distributions at\n",
    "$ T' $ and $ T'' $ are relatively similar to each other — but unlike\n",
    "the distribution at $ T $.\n",
    "\n",
    "Apparently, the distributions of $ y_t $  converge to a fixed long-run\n",
    "distribution as $ t \\to \\infty $.\n",
    "\n",
    "When such a distribution exists it is called a *stationary distribution*."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9ea72a5b",
   "metadata": {},
   "source": [
    "### Stationary Distributions\n",
    "\n",
    "In our setting, a distribution $ \\psi_{\\infty} $ is said to be *stationary* for $ x_t $ if\n",
    "\n",
    "$$\n",
    "x_t \\sim \\psi_{\\infty}\n",
    "\\quad \\text{and} \\quad\n",
    "x_{t+1} = A x_t + C w_{t+1}\n",
    "\\quad \\implies \\quad\n",
    "x_{t+1} \\sim \\psi_{\\infty}\n",
    "$$\n",
    "\n",
    "Since\n",
    "\n",
    "1. in the present case, all distributions are Gaussian  \n",
    "1. a Gaussian distribution is pinned down by its mean and variance-covariance matrix  \n",
    "\n",
    "\n",
    "we can restate the definition as follows: $ \\psi_{\\infty} $ is stationary for $ x_t $ if\n",
    "\n",
    "$$\n",
    "\\psi_{\\infty}\n",
    "= N(\\mu_{\\infty}, \\Sigma_{\\infty})\n",
    "$$\n",
    "\n",
    "where $ \\mu_{\\infty} $ and $ \\Sigma_{\\infty} $ are fixed points of [(21.6)](#equation-lss-mut-linear-models) and [(21.7)](#equation-eqsigmalaw-linear-models) respectively."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2611484a",
   "metadata": {},
   "source": [
    "### Covariance Stationary Processes\n",
    "\n",
    "Let’s see what happens to the preceding figure if we start $ x_0 $ at the stationary distribution.\n",
    "\n",
    "\n",
    "<a id='lss-s-fig'></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5fa99cd6",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "cross_plot(A_2, C_2, G_2, steady_state='True')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9f818608",
   "metadata": {},
   "source": [
    "Now the  differences in the observed distributions at $ T, T' $ and $ T'' $ come entirely from random fluctuations due to the finite sample size.\n",
    "\n",
    "By\n",
    "\n",
    "- our choosing $ x_0 \\sim N(\\mu_{\\infty}, \\Sigma_{\\infty}) $  \n",
    "- the definitions of $ \\mu_{\\infty} $ and $ \\Sigma_{\\infty} $ as fixed points of [(21.6)](#equation-lss-mut-linear-models) and [(21.7)](#equation-eqsigmalaw-linear-models) respectively  \n",
    "\n",
    "\n",
    "we’ve ensured that\n",
    "\n",
    "$$\n",
    "\\mu_t = \\mu_{\\infty}\n",
    "\\quad \\text{and} \\quad\n",
    "\\Sigma_t = \\Sigma_{\\infty}\n",
    "\\quad \\text{for all } t\n",
    "$$\n",
    "\n",
    "Moreover, in view of [(21.14)](#equation-eqnautocov), the autocovariance function takes the form $ \\Sigma_{t+j,t} = A^j \\Sigma_\\infty $, which depends on $ j $ but not on $ t $.\n",
    "\n",
    "This motivates the following definition.\n",
    "\n",
    "A  process $ \\{x_t\\} $ is said to be *covariance stationary* if\n",
    "\n",
    "- both $ \\mu_t $ and $ \\Sigma_t $ are constant in $ t $  \n",
    "- $ \\Sigma_{t+j,t} $ depends on the time gap $ j $ but not on time $ t $  \n",
    "\n",
    "\n",
    "In our setting, $ \\{x_t\\} $ will be covariance stationary if $ \\mu_0, \\Sigma_0, A, C $  assume values that  imply that none of $ \\mu_t, \\Sigma_t, \\Sigma_{t+j,t} $ depends on $ t $."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e9484fa7",
   "metadata": {},
   "source": [
    "### Conditions for Stationarity"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "841ff191",
   "metadata": {},
   "source": [
    "#### The Globally Stable Case\n",
    "\n",
    "The difference equation $ \\mu_{t+1} = A \\mu_t $ is known to have *unique*\n",
    "fixed point $ \\mu_{\\infty} = 0 $ if all eigenvalues of $ A $ have moduli strictly less than unity.\n",
    "\n",
    "That is, if  `(np.absolute(np.linalg.eigvals(A)) < 1).all() == True`.\n",
    "\n",
    "The difference equation [(21.7)](#equation-eqsigmalaw-linear-models) also has a unique fixed point in this case, and, moreover\n",
    "\n",
    "$$\n",
    "\\mu_t \\to \\mu_{\\infty} = 0\n",
    "\\quad \\text{and} \\quad\n",
    "\\Sigma_t \\to \\Sigma_{\\infty}\n",
    "\\quad \\text{as} \\quad t \\to \\infty\n",
    "$$\n",
    "\n",
    "regardless of the initial conditions $ \\mu_0 $ and $ \\Sigma_0 $.\n",
    "\n",
    "This is the *globally stable case* — see [these notes](https://python.quantecon.org/_static/lecture_specific/linear_models/iteration_notes.pdf) for more a theoretical treatment.\n",
    "\n",
    "However, global stability is more than we need for stationary solutions, and often more than we want.\n",
    "\n",
    "To illustrate, consider [our second order difference equation example](#lss-sode).\n",
    "\n",
    "Here the state is $ x_t = \\begin{bmatrix} 1 & y_t & y_{t-1} \\end{bmatrix}' $.\n",
    "\n",
    "Because of the constant first component in the state vector, we will never have $ \\mu_t \\to 0 $.\n",
    "\n",
    "How can we find stationary solutions that respect a constant state component?"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5563985c",
   "metadata": {},
   "source": [
    "#### Processes with a Constant State Component\n",
    "\n",
    "To investigate such a process, suppose that $ A $ and $ C $ take the\n",
    "form\n",
    "\n",
    "$$\n",
    "A\n",
    " = \\begin{bmatrix}\n",
    "    A_1 & a \\\\\n",
    "    0 & 1\n",
    "\\end{bmatrix}\n",
    " \\qquad\n",
    " C  = \\begin{bmatrix}\n",
    "     C_1 \\\\\n",
    "     0\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "where\n",
    "\n",
    "- $ A_1 $ is an $ (n-1) \\times (n-1) $ matrix  \n",
    "- $ a $ is an $ (n-1) \\times 1 $ column vector  \n",
    "\n",
    "\n",
    "Let $ x_t = \\begin{bmatrix} x_{1t}' & 1 \\end{bmatrix}' $ where $ x_{1t} $ is $ (n-1) \\times 1 $.\n",
    "\n",
    "It follows  that\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "x_{1,t+1} & = A_1 x_{1t} + a + C_1 w_{t+1}\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "Let $ \\mu_{1t} = \\mathbb{E} [x_{1t}] $ and take expectations on both sides of this expression to get\n",
    "\n",
    "\n",
    "<a id='equation-eqob29'></a>\n",
    "$$\n",
    "\\mu_{1,t+1} = A_1 \\mu_{1,t} + a \\tag{21.15}\n",
    "$$\n",
    "\n",
    "Assume now that the moduli of the eigenvalues of $ A_1 $ are all strictly less than one.\n",
    "\n",
    "Then [(21.15)](#equation-eqob29) has a unique stationary solution, namely,\n",
    "\n",
    "$$\n",
    "\\mu_{1\\infty} = (I-A_1)^{-1} a\n",
    "$$\n",
    "\n",
    "The stationary value of $ \\mu_t $ itself is then $ \\mu_\\infty := \\begin{bmatrix}\n",
    "\\mu_{1\\infty}' & 1 \\end{bmatrix}' $.\n",
    "\n",
    "The stationary values of $ \\Sigma_t $ and $ \\Sigma_{t+j,t} $ satisfy\n",
    "\n",
    "\n",
    "<a id='equation-eqnsigmainf'></a>\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\Sigma_\\infty  & = A \\Sigma_\\infty A' + C C' \\\\\n",
    "\\Sigma_{t+j,t} & = A^j \\Sigma_\\infty \\nonumber\n",
    "\\end{aligned} \\tag{21.16}\n",
    "$$\n",
    "\n",
    "Notice that here $ \\Sigma_{t+j,t} $ depends on the time gap $ j $ but not on calendar time $ t $.\n",
    "\n",
    "In conclusion, if\n",
    "\n",
    "- $ x_0 \\sim N(\\mu_{\\infty}, \\Sigma_{\\infty}) $ and  \n",
    "- the moduli of the eigenvalues of $ A_1 $ are all strictly less than unity  \n",
    "\n",
    "\n",
    "then the $ \\{x_t\\} $ process is covariance stationary, with constant state\n",
    "component.\n",
    "\n",
    ">**Note**\n",
    ">\n",
    ">If the eigenvalues of $ A_1 $ are less than unity in modulus, then\n",
    "(a) starting from any initial value, the mean and variance-covariance\n",
    "matrix both converge to their stationary values; and (b)\n",
    "iterations on [(21.7)](#equation-eqsigmalaw-linear-models) converge to the fixed point of the *discrete\n",
    "Lyapunov equation* in the first line of [(21.16)](#equation-eqnsigmainf)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ba298e2",
   "metadata": {},
   "source": [
    "### Ergodicity\n",
    "\n",
    "Let’s suppose that we’re working with a covariance stationary process.\n",
    "\n",
    "In this case, we know that the ensemble mean will converge to $ \\mu_{\\infty} $ as the sample size $ I $ approaches infinity."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9837c300",
   "metadata": {},
   "source": [
    "#### Averages over Time\n",
    "\n",
    "Ensemble averages across simulations are interesting theoretically, but in real life, we usually observe only a *single* realization $ \\{x_t, y_t\\}_{t=0}^T $.\n",
    "\n",
    "So now let’s take a single realization and form the time-series averages\n",
    "\n",
    "$$\n",
    "\\bar x := \\frac{1}{T} \\sum_{t=1}^T x_t\n",
    "\\quad \\text{and} \\quad\n",
    "\\bar y := \\frac{1}{T} \\sum_{t=1}^T y_t\n",
    "$$\n",
    "\n",
    "Do these time series averages converge to something interpretable in terms of our basic state-space representation?\n",
    "\n",
    "The answer depends on something called *ergodicity*.\n",
    "\n",
    "Ergodicity is the property that time series and ensemble averages coincide.\n",
    "\n",
    "More formally, ergodicity implies that time series sample averages converge to their\n",
    "expectation under the stationary distribution.\n",
    "\n",
    "In particular,\n",
    "\n",
    "- $ \\frac{1}{T} \\sum_{t=1}^T x_t \\to \\mu_{\\infty} $  \n",
    "- $ \\frac{1}{T} \\sum_{t=1}^T (x_t -\\bar x_T) (x_t - \\bar x_T)' \\to \\Sigma_\\infty $  \n",
    "- $ \\frac{1}{T} \\sum_{t=1}^T (x_{t+j} -\\bar x_T) (x_t - \\bar x_T)' \\to A^j \\Sigma_\\infty $  \n",
    "\n",
    "\n",
    "In our linear Gaussian setting, any covariance stationary process is also ergodic."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15a55e64",
   "metadata": {},
   "source": [
    "## Noisy Observations\n",
    "\n",
    "In some settings, the observation equation $ y_t = Gx_t $ is modified to\n",
    "include an error term.\n",
    "\n",
    "Often this error term represents the idea that the true state can only be\n",
    "observed imperfectly.\n",
    "\n",
    "To include an error term in the observation we introduce\n",
    "\n",
    "- An IID sequence of $ \\ell \\times 1 $ random vectors $ v_t \\sim N(0,I) $.  \n",
    "- A $ k \\times \\ell $ matrix $ H $.  \n",
    "\n",
    "\n",
    "and extend the linear state-space system to\n",
    "\n",
    "\n",
    "<a id='equation-st-space-rep-noisy'></a>\n",
    "$$\n",
    "\\begin{aligned}\n",
    "    x_{t+1} & =  A x_t + C w_{t+1}   \\\\\n",
    "    y_t &  =  G x_t + H v_t \\nonumber \\\\\n",
    "    x_0 & \\sim N(\\mu_0, \\Sigma_0) \\nonumber\n",
    "\\end{aligned} \\tag{21.17}\n",
    "$$\n",
    "\n",
    "The sequence $ \\{v_t\\} $ is assumed to be independent of $ \\{w_t\\} $.\n",
    "\n",
    "The process $ \\{x_t\\} $ is not modified by noise in the observation\n",
    "equation and its moments, distributions and stability properties remain the same.\n",
    "\n",
    "The unconditional moments of $ y_t $ from [(21.8)](#equation-lss-umy) and [(21.9)](#equation-lss-uvy)\n",
    "now become\n",
    "\n",
    "\n",
    "<a id='equation-lss-umy-2'></a>\n",
    "$$\n",
    "\\mathbb{E} [y_t] = \\mathbb{E} [G x_t + H v_t] = G \\mu_t \\tag{21.18}\n",
    "$$\n",
    "\n",
    "The variance-covariance matrix of $ y_t $ is easily shown to be\n",
    "\n",
    "\n",
    "<a id='equation-lss-uvy-2'></a>\n",
    "$$\n",
    "\\textrm{Var} [y_t] = \\textrm{Var} [G x_t + H v_t] = G \\Sigma_t G' + HH' \\tag{21.19}\n",
    "$$\n",
    "\n",
    "The distribution of $ y_t $ is therefore\n",
    "\n",
    "$$\n",
    "y_t \\sim N(G \\mu_t, G \\Sigma_t G' + HH')\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1c70931",
   "metadata": {},
   "source": [
    "## Prediction\n",
    "\n",
    "\n",
    "<a id='index-13'></a>\n",
    "The theory of prediction for linear state space systems is elegant and\n",
    "simple.\n",
    "\n",
    "\n",
    "<a id='ff-cm'></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df3fb542",
   "metadata": {},
   "source": [
    "### Forecasting Formulas – Conditional Means\n",
    "\n",
    "The natural way to predict variables is to use conditional distributions.\n",
    "\n",
    "For example, the optimal forecast of $ x_{t+1} $ given information known at time $ t $ is\n",
    "\n",
    "$$\n",
    "\\mathbb{E}_t [x_{t+1}] := \\mathbb{E} [x_{t+1} \\mid x_t, x_{t-1}, \\ldots, x_0 ] = Ax_t\n",
    "$$\n",
    "\n",
    "The right-hand side follows from $ x_{t+1} = A x_t + C w_{t+1} $ and the\n",
    "fact that $ w_{t+1} $ is zero mean and independent of $ x_t, x_{t-1}, \\ldots, x_0 $.\n",
    "\n",
    "That $ \\mathbb{E}_t [x_{t+1}] = \\mathbb{E}[x_{t+1} \\mid x_t] $ is an implication of $ \\{x_t\\} $ having the *Markov property*.\n",
    "\n",
    "The one-step-ahead forecast error is\n",
    "\n",
    "$$\n",
    "x_{t+1} - \\mathbb{E}_t [x_{t+1}] = Cw_{t+1}\n",
    "$$\n",
    "\n",
    "The covariance matrix of the forecast error is\n",
    "\n",
    "$$\n",
    "\\mathbb{E} [ (x_{t+1} - \\mathbb{E}_t [ x_{t+1}] ) (x_{t+1} - \\mathbb{E}_t [ x_{t+1}])'] = CC'\n",
    "$$\n",
    "\n",
    "More generally, we’d like to compute the $ j $-step ahead forecasts $ \\mathbb{E}_t [x_{t+j}] $ and $ \\mathbb{E}_t [y_{t+j}] $.\n",
    "\n",
    "With a bit of algebra, we obtain\n",
    "\n",
    "$$\n",
    "x_{t+j} = A^j x_t + A^{j-1} C w_{t+1} + A^{j-2} C w_{t+2} +\n",
    "\\cdots + A^0 C w_{t+j}\n",
    "$$\n",
    "\n",
    "In view of the IID property, current and past state values provide no information about future values of the shock.\n",
    "\n",
    "Hence $ \\mathbb{E}_t[w_{t+k}] = \\mathbb{E}[w_{t+k}] = 0 $.\n",
    "\n",
    "It now follows from linearity of expectations that the $ j $-step ahead forecast of $ x $ is\n",
    "\n",
    "$$\n",
    "\\mathbb{E}_t [x_{t+j}] = A^j x_t\n",
    "$$\n",
    "\n",
    "The $ j $-step ahead forecast of $ y $ is therefore\n",
    "\n",
    "$$\n",
    "\\mathbb{E}_t [y_{t+j}]\n",
    "= \\mathbb{E}_t [G x_{t+j} + H v_{t+j}]\n",
    "= G A^j x_t\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a5bb764b",
   "metadata": {},
   "source": [
    "### Covariance of Prediction Errors\n",
    "\n",
    "It is useful to obtain the covariance matrix of the vector of  $ j $-step-ahead prediction errors\n",
    "\n",
    "\n",
    "<a id='equation-eqob8'></a>\n",
    "$$\n",
    "x_{t+j} - \\mathbb{E}_t [ x_{t+j}] = \\sum^{j-1}_{s=0} A^s C w_{t-s+j} \\tag{21.20}\n",
    "$$\n",
    "\n",
    "Evidently,\n",
    "\n",
    "\n",
    "<a id='equation-eqob9a'></a>\n",
    "$$\n",
    "V_j := \\mathbb{E}_t [ (x_{t+j} - \\mathbb{E}_t [x_{t+j}] ) (x_{t+j} - \\mathbb{E}_t [x_{t+j}] )^\\prime ] =   \\sum^{j-1}_{k=0} A^k C C^\\prime A^{k^\\prime} \\tag{21.21}\n",
    "$$\n",
    "\n",
    "$ V_j $ defined in [(21.21)](#equation-eqob9a) can be calculated recursively via $ V_1 = CC' $ and\n",
    "\n",
    "\n",
    "<a id='equation-eqob9b'></a>\n",
    "$$\n",
    "V_j = CC^\\prime + A V_{j-1} A^\\prime, \\quad j \\geq 2 \\tag{21.22}\n",
    "$$\n",
    "\n",
    "$ V_j $ is the *conditional covariance matrix* of the errors in forecasting\n",
    "$ x_{t+j} $, conditioned on time $ t $ information $ x_t $.\n",
    "\n",
    "Under particular conditions, $ V_j $ converges to\n",
    "\n",
    "\n",
    "<a id='equation-eqob10'></a>\n",
    "$$\n",
    "V_\\infty = CC' + A V_\\infty A' \\tag{21.23}\n",
    "$$\n",
    "\n",
    "Equation [(21.23)](#equation-eqob10) is an example of a *discrete Lyapunov* equation in the covariance matrix $ V_\\infty $.\n",
    "\n",
    "A sufficient condition for $ V_j $ to converge is that the eigenvalues of $ A $ be strictly less than one in modulus.\n",
    "\n",
    "Weaker sufficient conditions for convergence  associate eigenvalues equaling or exceeding one in modulus with elements of $ C $ that equal $ 0 $.\n",
    "\n",
    "\n",
    "<a id='lm-fgs'></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "822ed568",
   "metadata": {},
   "source": [
    "## Code\n",
    "\n",
    "Our preceding simulations and calculations are based on code in\n",
    "the file [lss.py](https://github.com/QuantEcon/QuantEcon.py/blob/master/quantecon/lss.py) from the [QuantEcon.py](http://quantecon.org/quantecon-py) package.\n",
    "\n",
    "The code implements a class for handling linear state space models (simulations, calculating moments, etc.).\n",
    "\n",
    "One Python construct you might not be familiar with is the use of a generator function in the method `moment_sequence()`.\n",
    "\n",
    "Go back and [read the relevant documentation](https://python-programming.quantecon.org/python_advanced_features.html#generators) if you’ve forgotten how generator functions work.\n",
    "\n",
    "Examples of usage are given in the solutions to the exercises."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "46d9f167",
   "metadata": {},
   "source": [
    "## Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c5a33c75",
   "metadata": {},
   "source": [
    "## Exercise 21.1\n",
    "\n",
    "In several contexts, we want to compute forecasts of  geometric sums of future random variables governed by the linear state-space system [(21.1)](#equation-st-space-rep).\n",
    "\n",
    "We want the following objects\n",
    "\n",
    "- Forecast of a geometric sum of future $ x $’s, or $ \\mathbb{E}_t \\left[ \\sum_{j=0}^\\infty \\beta^j x_{t+j} \\right] $.  \n",
    "- Forecast of a geometric sum of future $ y $’s, or $ \\mathbb{E}_t \\left[\\sum_{j=0}^\\infty \\beta^j y_{t+j} \\right] $.  \n",
    "\n",
    "\n",
    "These objects are important components of some famous and  interesting dynamic models.\n",
    "\n",
    "For example,\n",
    "\n",
    "- if $ \\{y_t\\} $ is a stream of dividends, then $ \\mathbb{E} \\left[\\sum_{j=0}^\\infty \\beta^j y_{t+j} | x_t \\right] $ is a model of a stock price  \n",
    "- if $ \\{y_t\\} $ is  the money supply, then $ \\mathbb{E} \\left[\\sum_{j=0}^\\infty \\beta^j y_{t+j} | x_t \\right] $ is a  model of the price level  \n",
    "\n",
    "\n",
    "Show that:\n",
    "\n",
    "$$\n",
    "\\mathbb{E}_t \\left[\\sum_{j=0}^\\infty \\beta^j x_{t+j} \\right] = [I - \\beta A]^{-1} x_t\n",
    "$$\n",
    "\n",
    "and\n",
    "\n",
    "$$\n",
    "\\mathbb{E}_t \\left[\\sum_{j=0}^\\infty \\beta^j y_{t+j} \\right] = G[I - \\beta A]^{-1} x_t\n",
    "$$\n",
    "\n",
    "what must the modulus for every eigenvalue of $ A $ be less than?"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "004bb85b",
   "metadata": {},
   "source": [
    "## Solution to[ Exercise 21.1](https://python.quantecon.org/#lss_ex1)\n",
    "\n",
    "Suppose that every eigenvalue of $ A $ has modulus strictly less than $ \\frac{1}{\\beta} $.\n",
    "\n",
    "It [then follows](https://python.quantecon.org/linear_algebra.html#la-neumann-remarks) that $ I + \\beta A + \\beta^2 A^2 + \\cdots = \\left[I - \\beta A \\right]^{-1} $.\n",
    "\n",
    "This leads to our formulas:\n",
    "\n",
    "- Forecast of a geometric sum of future $ x $’s  \n",
    "\n",
    "\n",
    "$$\n",
    "\\mathbb{E}_t \\left[\\sum_{j=0}^\\infty \\beta^j x_{t+j} \\right]\n",
    "= [I + \\beta A + \\beta^2 A^2 + \\cdots \\ ] x_t = [I - \\beta A]^{-1} x_t\n",
    "$$\n",
    "\n",
    "- Forecast of a geometric sum of future $ y $’s  \n",
    "\n",
    "\n",
    "$$\n",
    "\\mathbb{E}_t \\left[\\sum_{j=0}^\\infty \\beta^j y_{t+j} \\right]\n",
    "= G [I + \\beta A + \\beta^2 A^2 + \\cdots \\ ] x_t\n",
    "= G[I - \\beta A]^{-1} x_t\n",
    "$$\n",
    "\n",
    "<p><a id=foot1 href=#foot1-link><strong>[1]</strong></a> The eigenvalues of $ A $ are $ (1,-1, i,-i) $.\n",
    "\n",
    "<p><a id=fn-ag href=#fn-ag-link><strong>[2]</strong></a> The correct way to argue this is by induction.  Suppose that\n",
    "$ x_t $ is Gaussian.  Then [(21.1)](#equation-st-space-rep) and\n",
    "[(21.10)](#equation-lss-glig) imply that $ x_{t+1} $ is Gaussian.  Since $ x_0 $\n",
    "is assumed to be Gaussian, it follows that every $ x_t $ is Gaussian.\n",
    "Evidently, this implies that each $ y_t $ is Gaussian."
   ]
  }
 ],
 "metadata": {
  "date": 1751441884.181295,
  "filename": "linear_models.md",
  "kernelspec": {
   "display_name": "Python",
   "language": "python3",
   "name": "python3"
  },
  "title": "Linear State Space Models"
 },
 "nbformat": 4,
 "nbformat_minor": 5
}