
%run ../initscript.py
import pandas as pd
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
from ipywidgets import *
%matplotlib inline

Sequential Data Models

The assumption that data points are independent and identically distributed (i.i.d.) allows us to express the likelihood function as the product, over all data points, of the probability distribution evaluated at each data point. In many applications, however, the i.i.d. assumption does not hold, for example:

Time-series: stock market, speech, video analysis
Ordered: text, genes
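As a point of comparison for the sequential models that follow, here is a minimal sketch of the i.i.d. factorization: the log-likelihood is a sum of per-point log densities, which equals the log of the product of densities. The Gaussian model, the synthetic data, and the function name `iid_log_likelihood` are assumptions chosen for illustration only.

```python
import numpy as np
import scipy.stats as stats

# Synthetic data (an assumption for illustration): 20 draws from N(1, 2^2).
rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=20)


def iid_log_likelihood(x, mu, sigma):
    """Log-likelihood of i.i.d. Gaussian data: the sum of per-point log densities."""
    return stats.norm.logpdf(x, loc=mu, scale=sigma).sum()


# Sum of log densities versus the log of the product of densities.
ll_sum = iid_log_likelihood(x, mu=1.0, sigma=2.0)
ll_prod = np.log(np.prod(stats.norm.pdf(x, loc=1.0, scale=2.0)))
print(ll_sum, ll_prod)
```

Working with the sum of logs is usually preferable in practice, since the raw product underflows quickly as the number of data points grows.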

One of the simplest ways to relax the i.i.d. assumption is to consider a Markov model.

A first-order Markov chain of observations \(\mathbf{x}_t\), in which each observation depends only on its immediate predecessor, has joint distribution

\begin{align*} p(\mathbf{x}_1, \ldots, \mathbf{x}_T) = p(\mathbf{x}_1) \prod_{t=2}^{T} p(\mathbf{x}_t|\mathbf{x}_{t-1}). \end{align*}
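The factorization above can be evaluated directly when the observations are discrete. The following is a minimal sketch, assuming a two-state chain with a hypothetical initial distribution `initial` and transition matrix `transition` (row = current state, column = next state); neither the numbers nor the function name come from the original notebook.

```python
import numpy as np


def markov_chain_log_prob(states, initial, transition):
    """Log joint probability of a discrete state sequence under a
    first-order Markov chain: log p(x_1) + sum_t log p(x_t | x_{t-1})."""
    states = np.asarray(states)
    log_p = np.log(initial[states[0]])
    # Pair each state with its successor and look up the transition probabilities.
    log_p += np.log(transition[states[:-1], states[1:]]).sum()
    return log_p


# Hypothetical parameters, chosen only for illustration.
initial = np.array([0.6, 0.4])
transition = np.array([[0.7, 0.3],
                       [0.2, 0.8]])

seq = [0, 0, 1, 1]
log_p = markov_chain_log_prob(seq, initial, transition)
print(np.exp(log_p))  # p(x_1=0) * p(0|0) * p(1|0) * p(1|1)
```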

Hidden Markov Models

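A hidden Markov model (HMM) pairs each observation \(\mathbf{x}_t\) with a latent variable \(\mathbf{z}_t\). The latent variables form a first-order Markov chain, and each observation depends only on its own latent variable, so the joint distribution is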

\begin{align*} p(\mathbf{X}, \mathbf{Z} | \theta) = p(\mathbf{z}_1|\pmb{\pi}) \left( \prod_{t=2}^{T} p(\mathbf{z}_t|\mathbf{z}_{t-1}, \mathbf{A}) \right) \prod_{t=1}^{T} p(\mathbf{x}_t|\mathbf{z}_{t}, \psi) \end{align*}

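where \(\mathbf{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_T\}\), \(\mathbf{Z}=\{\mathbf{z}_1, \ldots, \mathbf{z}_T\}\), and \(\theta = \{ \pmb{\pi}, \mathbf{A}, \psi \}\) denotes the set of parameters governing the model: \(\pmb{\pi}\) for the initial latent distribution, \(\mathbf{A}\) for the latent transitions, and \(\psi\) for the emission distributions.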
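To make the three factors of the joint distribution concrete, here is a minimal sketch that evaluates \(\log p(\mathbf{X}, \mathbf{Z} | \theta)\) for a given pair of sequences. It assumes discrete latent states and discrete observations, so that \(\psi\) reduces to an emission matrix `B` with `B[k, j]` the probability of observing symbol `j` in state `k`. The function name and all numeric values are hypothetical.

```python
import numpy as np


def hmm_log_joint(x, z, pi, A, B):
    """Log joint probability log p(X, Z | theta) of a discrete HMM.

    x  : observed symbols, length T
    z  : latent states, length T
    pi : initial state distribution
    A  : transition matrix, A[i, k] = p(z_t = k | z_{t-1} = i)
    B  : emission matrix (plays the role of psi), B[k, j] = p(x_t = j | z_t = k)
    """
    x, z = np.asarray(x), np.asarray(z)
    log_p = np.log(pi[z[0]])                     # initial term p(z_1 | pi)
    log_p += np.log(A[z[:-1], z[1:]]).sum()      # transitions p(z_t | z_{t-1}, A)
    log_p += np.log(B[z, x]).sum()               # emissions p(x_t | z_t, psi)
    return log_p


# Hypothetical parameters, chosen only for illustration.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])

z = [0, 0, 1]   # latent path
x = [0, 1, 1]   # observed symbols
log_joint = hmm_log_joint(x, z, pi, A, B)
print(np.exp(log_joint))
```

Note that this evaluates the joint for one known latent path; obtaining the likelihood of the observations alone would require summing over all latent paths, which this sketch does not attempt.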

Linear Dynamical System