{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%run ../initscript.py\n", "import pandas as pd\n", "import numpy as np\n", "import scipy.stats as stats\n", "import matplotlib.pyplot as plt\n", "from ipywidgets import *\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Sequential Data Models" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The assumption that data points are assumed to be independent and identically distributed (i.i.d.) allows us to express the likelihood function as the product over all data points of the probability distribution evaluated at each data point. For many applications, the i.i.d. assumption may not hold such as\n", "\n", "- Time-series: stock market, speech, video analysis\n", "\n", "- Ordered: text, genes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of the simplest ways to relax the i.i.d. assumption is to consider *Markov model*.\n", "\n", "A first order Markov chain of observations $\\mathbf{x}_t$ has joint distribution\n", "\n", "\\begin{align*}\n", "p(\\mathbf{x}_1, \\ldots, \\mathbf{x}_T) = p(\\mathbf{x}_1) \\prod_{t=2}^{T} p(\\mathbf{x}_t|\\mathbf{x}_{t-1}).\n", "\\end{align*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Hidden Markov Models" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The joint distribution is\n", "\n", "\\begin{align*}\n", "p(\\mathbf{X}, \\mathbf{Z} | \\theta) = p(\\mathbf{z}_1|\\pmb{\\pi}) \\left( \\prod_{t=2}^{T} p(\\mathbf{z}_t|\\mathbf{z}_{t-1}, \\mathbf{A}) \\right) \\prod_{t=1}^{T} p(\\mathbf{x}_t|\\mathbf{z}_{t}, \\psi)\n", "\\end{align*}\n", "\n", "where $\\mathbf{X} = \\{\\mathbf{x}_1, \\ldots, \\mathbf{x}_T\\}, \\mathbf{Z}=\\{\\mathbf{z}_1, \\ldots, \\mathbf{z}_T\\}$ and $\\theta = \\{ \\pmb{\\pi}, \\mathbf{A}, \\psi \\}$ denotes the set of parameters governing the model." ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "## Linear Dynamical System" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 2 }