{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " show code\n", " " ], "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%run ../initscript.py\n", "toggle()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Conic Programming" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Formulation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let $A \\in \\Re^{n \\times m}$, $b \\in \\Re^m$ and $c \\in \\Re^n$. A standard conic programming problem takes the form\n", "\n", "\\begin{align}\n", "p^\\ast = \\min_{x \\in \\Re^n} c^\\intercal x : Ax = b, ~ x \\in K, \\nonumber\n", "\\end{align}\n", "\n", "where $K \\subseteq \\Re^n$ is a convex cone. That is, $K$ is a convex set with the property that for all $x \\in K$, $\\lambda x \\in K$ for all $\\lambda > 0$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Semi-definite Programming (SDP)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Given symmetric matrics $C, A^\\ell \\in \\Re^{n \\times n}$ and $b_\\ell \\in \\Re^m$. Consider the SDP in standard form\n", "\n", "\\begin{align}\n", "p^* = \\min_{X} \\langle C, X\\rangle : \\langle A^\\ell, ~X\\rangle = b_\\ell, ~\\ell=1, \\ldots, L, ~X \\succeq 0 \\nonumber\n", "\\end{align}\n", "\n", "where $\\langle A, B\\rangle = \\text{Tr} (A^\\intercal B)$.\n", "\n", "\n", "SDP is a special case of conic programming because\n", "\n", "\\begin{align*}\n", "\\langle C, X\\rangle &= \\sum_{i=1}^{n} \\sum_{j=1}^{n} C_{ij} X_{ji} \\\\\n", "\\langle A^\\ell, X\\rangle &= \\sum_{i=1}^{n} \\sum_{j=1}^{n} A^\\ell_{ij} X_{ji} \\\\\n", "X \\in \\text{ cone } K &= \\{ X \\in \\Re^{n \\times n}: v^\\intercal X v \\ge 0,\\ \\forall v \\in \\Re^n \\}\n", "\\end{align*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Second-Order Cone Programming (SOCP)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A standard form for the SOCP model is\n", "\n", "\\begin{align*}\n", "p^\\ast = \\min_{x \\in \\mathbb{R}^n} c^\\intercal x : Ax = b, \\lVert C_\\ell x + d_\\ell \\rVert_2 \\le e^\\intercal_\\ell x + f_\\ell,\\ \\ell = 1, \\ldots, L.\n", "\\end{align*}\n", "\n", "SOCP is a special case of conic programming with cone\n", "\n", "\\begin{align*}\n", "\\mathcal{L}^{n+1} = \\left\\{ (y,t) | \\lVert y \\rVert_2 \\le t \\right\\}\n", "\\end{align*}\n", "\n", "and $(C_\\ell x + d_\\ell, e^\\intercal_\\ell x + f_\\ell) \\in \\mathcal{L}^{n+1}$.\n", "\n", "SOCP is a special case of SDP." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " show comment\n", " " ], "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hide_comment()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us use Schur complemets to show that SOCP is a special case of SDP.\n", "\n", "**Schur complement**: Given a symmetric block matrix\n", "\\begin{align*}\n", "X = \\left( \\begin{array}{ll}\n", "A_{p\\times p} & B \\\\\n", "B^\\intercal & C_{q\\times q}\n", "\\end{array} \\right)\n", "\\end{align*}\n", "\n", "with det$(A) \\neq 0$, the matrix $S = C - B^\\intercal A^{-1} B$ is the Schur complement of $A$ in $X$.\n", "\n", "**Lemma**: If $A \\succ 0$, then $X \\succeq 0 \\Leftrightarrow S \\succeq 0$.\n", "\n", "**Proof**:\n", "Let $f_v = \\min_u f(u, v)$ where \n", "\n", "\\begin{align*}\n", "f(u, v) = \\left( u^\\intercal, v^\\intercal \\right) \\left( \\begin{array}{ll}\n", "A_{p\\times p} & B \\\\\n", "B^\\intercal & C_{q\\times q}\n", "\\end{array} \\right)\n", "\\left( \\begin{array}{l}\n", "u \\\\\n", "v\n", "\\end{array} \\right)\n", "= u^\\intercal A u + 2 v^\\intercal B^\\intercal u + v^\\intercal C v. \n", "\\end{align*}\n", "\n", "Since $A \\succ 0$, $f$ is strictly convex in $u$ and it exists a unique global optimal as follows\n", "\n", "\\begin{align*}\n", "\\frac{\\partial f}{\\partial u} & = 2 A u + 2 B v = 0 \\Rightarrow u^* = - A^{-1} B v \\\\\n", "f_v &= v^\\intercal B^\\intercal A^{-1} B v - 2 v^\\intercal B^\\intercal A^{-1} B v + v^\\intercal C v \\\\\n", "&= v^\\intercal (C - B^\\intercal A^{-1} B) v = v^\\intercal S v\n", "\\end{align*}\n", "\n", "($\\Rightarrow$) If $S \\not\\succeq 0$, then $\\exists v_0$ such that $v_0^\\intercal S v_0 < 0 \\Rightarrow f_{v_0} < 0$. We have a contradiction because $f_{v_0} = z^\\intercal X z < 0$ with \n", "\n", "\\begin{align*}\n", "z = \\left( \\begin{array}{l}\n", "u \\\\\n", "v\n", "\\end{array} \\right) = \\left( \\begin{array}{c}\n", "- A^{-1} B v_0 \\\\\n", "v_0\n", "\\end{array} \\right)\n", "\\end{align*}\n", "\n", "($\\Leftarrow$) If $S \\succeq 0$, then any $(u, v)$ gives\n", "\n", "\\begin{align*}\n", "\\left( \\begin{array}{l}\n", "u \\\\\n", "v\n", "\\end{array} \\right)^\\intercal X \\left( \\begin{array}{l}\n", "u \\\\\n", "v\n", "\\end{array} \\right) \\ge f_v = v^\\intercal S v \\ge 0 \\Rightarrow X \\succeq 0\n", "\\end{align*}\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we assume\n", "\n", "\\begin{align}\n", "e^\\intercal_\\ell x + f_\\ell > 0 \\nonumber\n", "\\end{align}\n", "\n", "otherwise $C_\\ell x + d_\\ell = 0$ and SOCP degenerates to LP.\n", "\n", "We can write the constraint\n", "\n", "\\begin{align*}\n", "\\lVert C_\\ell x + d_\\ell \\rVert_2 \\le e^\\intercal_\\ell x + f_\\ell\n", "\\end{align*}\n", "\n", "as\n", "\n", "\\begin{align*}\n", "\\left( \\begin{array}{cc}\n", "(e^\\intercal_\\ell x + f_\\ell) \\texttt{I} & C_\\ell x + d_\\ell\\\\\n", "(C_\\ell x + d_\\ell)^\\intercal & e^\\intercal_\\ell x + f_\\ell\n", "\\end{array} \\right) \\succeq 0\n", "\\end{align*}\n", "\n", "Because\n", "\n", "\\begin{align*}\n", "\\left( \\begin{array}{cc}\n", "(e^\\intercal_\\ell x + f_\\ell) \\texttt{I} & C_\\ell x + d_\\ell\\\\\n", "(C_\\ell x + d_\\ell)^\\intercal & e^\\intercal_\\ell x + f_\\ell\n", "\\end{array} \\right) \\succeq 0 &\\Leftrightarrow (e^\\intercal_\\ell x + f_\\ell) - (C_\\ell x + d_\\ell)^\\intercal \\frac{1}{e^\\intercal_\\ell x + f_\\ell} (C_\\ell x + d_\\ell) \\ge 0 \\\\\n", "& \\Leftrightarrow \\lVert C_\\ell x + d_\\ell \\rVert_2^2 \\le (e^\\intercal_\\ell x + f_\\ell)^2 \\\\\n", "& \\Leftrightarrow \\lVert C_\\ell x + d_\\ell \\rVert_2 \\le e^\\intercal_\\ell x + f_\\ell,\n", "\\end{align*}\n", "\n", "where the last $\\Leftrightarrow$ holds because both terms are positive. Therefore, SOCP is a special case of SDP.\n", "\n", " $\\blacksquare$ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convex quadratic constrained quadratic programming (QCQP) is a special case of SOCP with $e_\\ell = 0$." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "\n", " show comment\n", " " ], "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hide_comment()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We show that QCQP is a special case of SOCP\n", "\n", "**Proof**:\n", "First, we show that we can reformulation Hyperbolic Constraints as Second-Order Cone Constraints as follows:\n", "\n", "\\begin{align*}\n", "u^\\intercal u \\le y z, y \\ge 0, z \\ge 0 \\Leftrightarrow \\left\\lVert \\left(\\begin{array}{c}\n", "2u \\\\\n", "y - z\n", "\\end{array} \\right) \\right\\rVert_2 \\le y+z, y \\ge 0, z \\ge 0\n", "\\end{align*}\n", "\n", "$\\Leftrightarrow$ holds because\n", "\n", "\\begin{align*}\n", "4 u^\\intercal u \\le 4 y z = (y+z)^2 - (y-z)^2 & \\Leftrightarrow 4 u^\\intercal u + (y-z)^2 \\le (y+z)^2 \\\\\n", "& \\Leftrightarrow \\sqrt{4 u^\\intercal u + (y-z)^2} \\le y+z \\\\\n", "& \\Leftrightarrow \\left\\lVert \\left(\\begin{array}{c}\n", "2u \\\\\n", "y - z\n", "\\end{array} \\right) \\right\\rVert_2 \\le y+z\n", "\\end{align*}\n", "\n", "Then, consider a QCQP\n", "\n", "\\begin{align*}\n", "\\min_x x^\\intercal Q x + c^\\intercal x: Ax = b,\\ x \\ge 0\n", "\\end{align*}\n", "\n", "is equivalent to\n", "\n", "\\begin{align*}\n", "\\min_x t + c^\\intercal x: Ax = b,\\ t \\ge x^\\intercal Q x,\\ x \\ge 0,\n", "\\end{align*}\n", "\n", "we only need to show that the quadratic constraint $t \\ge x^\\intercal Q x$ is a SOC constraint. Since $Q$ is positive semidefinite, Cholesky decomposition gives $Q = L L^\\intercal$. With $u = L^\\intercal x$\n", "\n", "\\begin{align*}\n", "t \\ge x^\\intercal Q x &\\Leftrightarrow t \\cdot 1 \\ge u^\\intercal u,\\ t \\ge 0 \\\\\n", "&\\Leftrightarrow \\left\\lVert \\left(\\begin{array}{c}\n", "2u \\\\\n", "t - 1\n", "\\end{array} \\right) \\right\\rVert_2 \\le t+1, t \\ge 0\n", "\\end{align*}\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, SOCP **cannot**, in general, be cast as QCQP. One may be tempted to square the SCO constraints and obtain a quadratic constraint of the form\n", "\n", "\\begin{align*}\n", "\\lVert C_\\ell x + d_\\ell \\rVert_2^2 \\le (e^\\intercal_\\ell x + f_\\ell)^2, \\ e^\\intercal_\\ell x + f_\\ell \\ge 0.\n", "\\end{align*}\n", "\n", "While the constraints are equivalent to the original SOC constraint, the first may not be convex. Consider the quadratic terms in each side of the inequality. We have\n", "\n", "\\begin{align*}\n", "x^\\intercal C^\\intercal_\\ell C_\\ell x + \\cdots \\le x^\\intercal e_\\ell e_\\ell^\\intercal x + \\cdots\n", "\\end{align*}\n", "\n", "It is convex only if $C^\\intercal_\\ell C_\\ell - e_\\ell e_\\ell^\\intercal \\succeq 0$.\n", "\n", " $\\blacksquare$ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Duality" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Semi-definite Programming (SDP)\n", "\n", "Consider an SDP\n", "\\begin{align}\n", "p^* = \\min_{X} \\langle C, X\\rangle : \\langle A^\\ell, ~X\\rangle = b_\\ell, ~\\ell=1, \\ldots, L, ~X \\succeq 0 \\nonumber\n", "\\end{align}\n", "\n", "The dual problem is also an SDP as follows\n", "\\begin{align}\n", "d^\\ast = \\max_{\\nu} \\nu^\\intercal b: \\sum_{\\ell=1}^{L} \\nu_\\ell A^\\ell \\preceq C \\nonumber\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Lagrangian\n", "\n", "\\begin{align}\n", "\\mathcal{L}(X, \\lambda, \\nu) & = \\langle C,X\\rangle + \\sum_{\\ell=1}^{L} \\nu_\\ell (b_\\ell - \\langle A^\\ell, X\\rangle) - \\lambda \\cdot \\lambda_{\\min} (X) \\nonumber \\\\\n", "& = \\nu^\\intercal b + \\left\\langle C - \\sum_{\\ell=1}^{L} \\nu_\\ell A^\\ell, X \\right\\rangle - \\lambda \\cdot \\lambda_{\\min} (X) \\nonumber\n", "\\end{align}\n", "\n", "where $\\nu \\in \\mathbb{R}^m$ and $\\lambda \\ge 0$ are the dual variables, $\\lambda_{\\min}(X)$ is the smallest eigenvalue of $X$. We have used $\\lambda_{\\min}(X) \\ge 0$ is equivalent to $X \\succeq 0$.\n", "\n", "\\begin{align*}\n", "p^\\ast & = \\min_{X} \\max_{\\nu, \\lambda \\ge 0} \\mathcal{L}(X, \\lambda, \\nu)\\\\\n", "& \\ge \\max_{\\nu, \\lambda \\ge 0} \\min_{X} \\mathcal{L}(X, \\lambda, \\nu)\\\\\n", "& = d^\\ast\n", "\\end{align*}" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " show comment\n", " " ], "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hide_comment()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We show the min-max and max-min relationship\n", "\n", "**Proof**\n", "\n", "For any $X^0, \\lambda^0, \\nu^0$,\n", "\n", "\\begin{align*}\n", "&&\\ \\mathcal{L}(X^0, \\lambda^0, \\nu^0) &\\ge \\min_{X} \\mathcal{L}(X, \\lambda^0, \\nu^0) \\\\\n", "\\Rightarrow &&\\ \\max_{\\nu, \\lambda \\ge 0} \\mathcal{L}(X^0, \\lambda, \\nu) &\\ge \\max_{\\nu, \\lambda \\ge 0} \\min_{X} \\mathcal{L}(X, \\lambda, \\nu) \\\\\n", "\\Rightarrow &&\\ \\min_{X} \\max_{\\nu, \\lambda \\ge 0} \\mathcal{L}(X, \\lambda, \\nu) &\\ge \\max_{\\nu, \\lambda \\ge 0} \\min_{X} \\mathcal{L}(X, \\lambda, \\nu) \\\\\n", "\\end{align*}\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We show that $\\lambda_{\\min}(\\cdot)$ can be represented as an optimization problem.\n", "\n", "**Lemma**: Let $A$ be a symmetric matrix and $\\lambda_{\\min}(A)$ is the smallest eigenvalue of $A$. We have\n", "\n", "\\begin{align}\n", "\\lambda_{\\min}(A) = \\min_{Y} \\langle Y, A\\rangle: Y \\succeq 0,\\ \\textbf{Tr}Y = 1 \\nonumber\n", "\\end{align}\n", "\n", "**Proof**\n", "$A = P \\Lambda P^\\intercal$ where $P$ is orthogonal matrix and $\\Lambda = \\text{diag}(\\lambda_1,\\ldots,\\lambda_n)$\n", "\n", "\\begin{align*}\n", "\\text{Tr}(YA) = \\text{Tr}(Y P \\Lambda P^\\intercal) = \\text{Tr}(P^\\intercal Y P \\Lambda) = \\text{Tr}(Y^{\\prime} \\Lambda )\n", "\\end{align*}\n", "\n", "Since $Y^{\\prime} \\succeq 0$ and $\\textbf{Tr}Y^{\\prime} = 1$, we can assume that $A$ is diagonal without loss of generality.\n", "\n", "When $A$ is diagonal, $(YA)_{ii} = Y_{ii} A_{ii}$ and $\\text{Tr}(YA) = \\sum_{i} Y_{ii} A_{ii}$. Then, we can restrict $Y$ to be diagonal as well. So\n", "\n", "\\begin{align*}\n", "\\min_{Y} \\langle Y, A\\rangle: Y \\succeq 0,\\ \\textbf{Tr}Y = 1 \\Leftrightarrow \\min_{Y} \\sum_{i} Y_i \\lambda_i : Y_i \\ge 0,\\ \\sum_{i} Y_i = 1,\n", "\\end{align*}\n", "\n", "which is $\\lambda_{\\min}(A)$. The lemma is proved.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**Lemma**:\n", "\n", "\\begin{align*}\n", "p^\\ast = \\min_{X} \\max_{\\nu, \\lambda \\ge 0} \\mathcal{L}(X, \\lambda, \\nu)\n", "\\end{align*}\n", "\n", "**Proof**:\n", "\n", "\\begin{align}\n", "\\min_{X} \\max_{\\nu, \\lambda \\ge 0} \\mathcal{L}(X, \\lambda, \\nu)& = \\min_{X} \\max_{\\nu, \\lambda \\ge 0} \\langle C,X\\rangle + \\sum_{\\ell=1}^{L} \\nu_\\ell (b_\\ell - \\langle A^\\ell, X\\rangle) - \\lambda \\cdot \\lambda_{\\min} (X) \\nonumber \\\\\n", "&= \\min_{X} \\left\\{ \\begin{aligned}\n", "&\\langle C,X\\rangle + \\sum_{\\ell=1}^{L} \\nu_\\ell (b_\\ell - \\langle A^\\ell, X\\rangle) && \\text{if } \\lambda_{\\min} (X) \\ge 0 \\Rightarrow X \\succeq 0 \\\\\n", "& \\infty && \\text{otherwise}\n", "\\end{aligned} \\right. \\nonumber \\\\\n", "&=\\min_{X \\succeq 0} \\max_{\\nu} \\langle C,X\\rangle + \\sum_{\\ell=1}^{L} \\nu_\\ell (b_\\ell - \\langle A^\\ell, X\\rangle) \\nonumber \\\\\n", "&= \\min_{X \\succeq 0} \\left\\{\\begin{aligned}\n", "&\\langle C,X\\rangle && \\text{if } \\langle A^\\ell, X\\rangle = b_\\ell, ~\\ell=1, \\ldots, L \\\\\n", "& -\\infty && \\text{otherwise}\n", "\\end{aligned} \\right. \\nonumber \\\\\n", "&= \\min_{X \\succeq 0} \\langle C, X\\rangle : \\langle A^\\ell, ~X\\rangle = b_\\ell, ~\\ell=1, \\ldots, \\ell \\nonumber \\\\\n", "&= p^\\ast \\nonumber \n", "\\end{align}\n", "\n", "The fourth equality holds because an optimal solution may be $v_i = 0$ if $b_i - \\langle A_i, X\\rangle = 0$ and $v_i = -t \\cdot \\text{sign}(b_i - \\langle A_i, X\\rangle)$ otherwise with $t \\to \\infty$.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**Lemma**:\n", "\n", "\\begin{align*}\n", "d^\\ast=\\max_{\\nu, \\lambda \\ge 0} \\min_{X} \\mathcal{L}(X, \\lambda, \\nu)\n", "\\end{align*}\n", "\n", "
\n", "\n", "**Proof**:\n", "\n", "For notational convenience, we let\n", "\n", "\\begin{align}\n", "Z &= C - \\sum_{\\ell=1}^{L} \\nu_\\ell A^\\ell \\nonumber \\\\\n", "G(\\lambda, Z) &= \\min_{X} ( \\langle Z, X \\rangle - \\lambda \\cdot \\lambda_{\\min} (X) ) \\nonumber\n", "\\end{align}\n", "\n", "where $G(\\lambda, \\cdot)$ is the conjugate of the concave function $\\lambda \\cdot \\lambda_{\\min} (X)$.\n", "\n", "The conjugate $f^\\ast$ of a function $f$ is given by\n", "\\begin{align}\n", "f^\\ast(y) = \\sup_{x \\in \\textbf{dom} f} (y^\\intercal x - f(x)) \\nonumber\n", "\\end{align}\n", "\n", "and \n", "\n", "\\begin{align*}\n", "G(\\lambda, \\cdot) &= \\max_{X} ( \\langle \\cdot, -X \\rangle + \\lambda \\cdot \\lambda_{\\min} (X) )\n", "\\end{align*}\n", "\n", "Suppose we can show that\n", "\n", "\\begin{align}\n", "G(\\lambda, Z) = \\left\\{ \\begin{aligned}\n", "&0 && \\textrm{if } \\textrm{Tr}(Z) = \\lambda \\ge 0, Z \\succeq 0, \\text{ i.e., } Z \\in \\mathcal{Z}(\\lambda) \\\\\n", "&-\\infty && \\textrm{otherwise}\n", "\\end{aligned} \\right. \\nonumber\n", "\\end{align}\n", "\n", "where $\\mathcal{Z}(\\lambda) = \\{Z: \\textrm{Tr}(Z) = \\lambda \\ge 0, Z \\succeq 0\\}$. Then,\n", "\n", "\\begin{align*}\n", "\\max_{\\nu, \\lambda \\ge 0} \\min_{X} \\mathcal{L}(X, \\lambda, \\nu) &= \\max_{\\nu, \\lambda \\ge 0} \\nu^\\intercal b + G(\\lambda, Z) \\\\\n", "&= \\max_{\\nu, \\lambda \\ge 0} \\nu^\\intercal b: Z = C - \\sum_{i=1}^{m} \\nu_i A_i \\succeq 0, \\textrm{Tr}(Z) = \\lambda \\ge 0 \\\\\n", "&= \\max_{\\nu} \\nu^\\intercal b: C - \\sum_{i=1}^{m} \\nu_i A_i \\succeq 0 \\\\\n", "&= d^*\n", "\\end{align*}\n", "\n", "where $\\lambda$ can be eliminated because $Z \\succeq 0$ implies $\\textrm{Tr}(Z) \\ge 0$.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we will complete the proof by showing that\n", "\n", "\\begin{align}\n", "G(\\lambda, Z) = \\left\\{ \\begin{aligned}\n", "&0 && \\textrm{if } \\textrm{Tr}(Z) = \\lambda \\ge 0, Z \\succeq 0, \\text{ i.e., } Z \\in \\mathcal{Z}(\\lambda) \\\\\n", "&-\\infty && \\textrm{otherwise}\n", "\\end{aligned} \\right. \\label{eqn:g_value}\n", "\\end{align}\n", "\n", "Because $\\lambda_{\\min}(\\cdot)$ can be represented as an optimization problem, we have\n", "\n", "\\begin{align}\n", "G(\\lambda, Z) & = \\min_{X} \\left( \\langle Z, X \\rangle - \\lambda \\min_{Y \\succeq 0, \\textbf{Tr} Y = 1} \\langle Y, X \\rangle \\right) = \\min_{X} \\max_{Y \\succeq 0, \\textbf{Tr} Y = 1} \\langle Z - \\lambda Y, X \\rangle \\nonumber \\\\\n", "& = \\min_{X} \\max_{Y \\succeq 0, \\textbf{Tr} Y = \\lambda} \\langle Z - Y, X \\rangle \\nonumber \\\\\n", "& \\ge \\max_{Y \\succeq 0, \\textbf{Tr} Y = \\lambda} \\min_{X \\succeq 0} \\langle Z - Y, X \\rangle \\nonumber \\\\\n", "& = \\max_{Y \\succeq 0, \\textbf{Tr} Y = \\lambda} \\left\\{ \\begin{aligned}\n", "&0 && \\text{if } Z = Y, \\\\\n", "&-\\infty && \\text{otherwise}\n", "\\end{aligned} \\right. \\nonumber \\\\\n", "& = \\left\\{ \\begin{aligned}\n", "&0 && \\text{if } \\textrm{Tr}(Z) = \\lambda \\ge 0, Z \\succeq 0, \\\\\n", "&-\\infty && \\text{otherwise}\n", "\\end{aligned} \\right. \\label{eqn:g}\n", "\\end{align}\n", " \n", "We prove \\eqref{eqn:g_value} in two steps:\n", "\n", "1. **Proof of $G(\\lambda, Z) = 0$ if $Z \\in \\mathcal{Z}(\\lambda)$**\n", "\n", "Following the definition of $G(\\lambda, Z)$, $G(\\lambda, Z) \\le 0$ since $X = 0$ is a feasible solution. We also have $G(\\lambda, Z) \\ge 0$ because of \\eqref{eqn:g}. So $G(\\lambda, Z) = 0$ if $Z \\in \\mathcal{Z}(\\lambda)$.\n", "\n", "2. **Proof of $G(\\lambda, Z) = -\\infty$ if $Z \\notin \\mathcal{Z}(\\lambda)$**\n", "\n", "If $\\textrm{Tr}(Z) \\neq \\lambda$, choosing $X = \\epsilon t \\texttt{I}$ with $\\epsilon = -\\text{sign}(\\textrm{Tr}(Z) - \\lambda)$ and $t \\to +\\infty$ implies\n", "\n", "\\begin{align*}\n", "G(\\lambda, Z) &\\le \\langle Z, \\epsilon t \\texttt{I} \\rangle - \\lambda \\cdot \\lambda_{\\min} (\\epsilon t \\texttt{I}) \\\\\n", "& = -t |\\textrm{Tr}(Z) + \\lambda| \\to \\infty\n", "\\end{align*}\n", "\n", "If $\\textrm{Tr}(Z) = \\lambda$, but $Z \\not\\succeq 0$ (i.e., $\\lambda_{\\max} (Z) < 0$), let $X = t u u^\\intercal$, where $u$ is an unit-norm ($\\lVert u^\\intercal u \\rVert = 1$) eigenvector corresponding to $\\lambda_{\\max} (Z)$ and $t \\to +\\infty$. We have\n", "\n", "\\begin{align}\n", "G(\\lambda, Z) &= \\min_{X} ( \\langle Z, X \\rangle - \\lambda \\cdot \\lambda_{\\min} (X) ) \\le \\langle Z, t u u^\\intercal \\rangle - \\lambda \\cdot \\lambda_{\\min} (t u u^\\intercal) \\nonumber \\\\\n", "&= t \\textrm{Tr} (Z^\\intercal uu^\\intercal) = t \\lambda_{\\max}(Z) \\to -\\infty \\nonumber\n", "\\end{align}\n", "\n", "because\n", "\n", "\\begin{align*}\n", "\\textrm{Tr} (Z^\\intercal uu^\\intercal) &= \\textrm{Tr} (uu^\\intercal Z^\\intercal ) = \\textrm{Tr} (u (Zu)^\\intercal) = \\textrm{Tr} (u \\lambda_{\\max}(Z) u^\\intercal) \\\\\n", "& = \\lambda_{\\max}(Z) \\textrm{Tr} (u u^\\intercal) = \\lambda_{\\max}(Z) \\textrm{Tr} (u^\\intercal u) = \\lambda_{\\max}(Z)\n", "\\end{align*}\n", "\n", "and $\\lambda_{\\min} (u u^\\intercal) = 0$ because $u u^\\intercal$ is rank one matrix.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Lemma**: $\\lambda_{\\min}$ is concave, i.e., $\\lambda_{\\min}\\left(\\frac{1}{2}A_1 + \\frac{1}{2}A_2\\right) \\ge \\frac{1}{2}\\lambda_{\\min}(A_1) + \\frac{1}{2}\\lambda_{\\min}(A_2)$\n", "\n", "**Proof**:\n", "Let $\\mathcal{Y} = \\{Y: Y \\succeq 0, \\textbf{Tr}Y = 1 \\}$. We have\n", "\\begin{align}\n", "\\min_{Y \\in \\mathcal{Y}} \\left\\langle Y, \\frac{1}{2}A_1 + \\frac{1}{2}A_2 \\right\\rangle = \\min_{Y \\in \\mathcal{Y}} \\left( \\frac{1}{2} \\left\\langle Y, A_1 \\right\\rangle + \\frac{1}{2} \\left\\langle Y, A_2 \\right\\rangle \\right) \\nonumber\n", "\\end{align}\n", "\n", "Since, for any $Y \\in \\mathcal{Y}$, \n", "\\begin{align}\n", "&& \\frac{1}{2} \\left\\langle Y, A_1 \\right\\rangle + \\frac{1}{2} \\left\\langle Y, A_2 \\right\\rangle &\\ge \\frac{1}{2} \\min_{Y \\in \\mathcal{Y}} \\left\\langle Y, A_1 \\right\\rangle + \\frac{1}{2} \\min_{Y \\in \\mathcal{Y}} \\left\\langle Y, A_2 \\right\\rangle \\nonumber \\\\\n", "\\Rightarrow && \\min_{Y \\in \\mathcal{Y}} \\left( \\frac{1}{2} \\left\\langle Y, A_1 \\right\\rangle + \\frac{1}{2} \\left\\langle Y, A_2 \\right\\rangle \\right) & \\ge \\frac{1}{2} \\min_{Y \\in \\mathcal{Y}} \\left\\langle Y, A_1 \\right\\rangle + \\frac{1}{2} \\min_{Y \\in \\mathcal{Y}} \\left\\langle Y, A_2 \\right\\rangle \\nonumber\n", "\\end{align}\n", "\n", " $\\blacksquare$ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Conic Lagrangian\n", "\n", "\\begin{align}\n", "\\mathcal{L}^c(X, \\nu, Y) & = \\langle C,X\\rangle + \\sum_{\\ell=1}^{L} \\nu_\\ell (b_\\ell - \\langle A^\\ell, X\\rangle) - \\langle Y,X\\rangle \\nonumber \\\\\n", "& = \\nu^\\intercal b + \\left\\langle C - \\sum_{\\ell=1}^{L} \\nu_\\ell A^\\ell, X \\right\\rangle - \\langle Y,X\\rangle \\nonumber\n", "\\end{align}\n", "\n", "where now we associate a matrix dual variable $Y \\succeq 0$ to the constraint $X \\succeq 0$.\n", "\n", "\\begin{align*}\n", "p^\\ast &= \\min_{X \\succeq 0} \\max_{\\nu, Y \\succeq 0} \\mathcal{L}^c(X, \\nu, Y) \\\\\n", "& \\ge \\max_{\\nu, Y \\succeq 0} \\min_{X \\succeq 0} \\mathcal{L}^c(X, \\nu, Y) \\\\\n", "& = d^\\ast \n", "\\end{align*}" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "\n", " show comment\n", " " ], "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hide_comment()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Lemma**\n", "\n", "\\begin{align*}\n", "p^\\ast = \\min_{X} \\max_{\\nu, Y \\succeq 0} \\mathcal{L}^c(X, \\nu, Y)\n", "\\end{align*}\n", "\n", "**Proof**\n", "Recall the lemma that $\\lambda_{\\min}(\\cdot)$ is represented as an optimization problem.\n", "\n", "We have\n", "\\begin{align}\n", "\\min_{Y \\succeq 0} \\langle Y, X\\rangle = \\min_{t \\ge 0} \\min_{Y \\succeq 0, \\textbf{Tr} Y = t} \\langle Y, X\\rangle = \\min_{t \\ge 0} t \\lambda_{\\min}(X) = \n", "\\left\\{\n", "\\begin{aligned}\n", "&0\t\t \t && \\text{if } X \\succeq 0 \\\\\n", "&-\\infty && \\text{otherwise.}\n", "\\end{aligned}\n", "\\right. \\nonumber\n", "\\end{align}\n", "\n", "Therefore, we get\n", "\\begin{align}\n", "\\min_{X} \\max_{\\nu, Y \\succeq 0} \\mathcal{L}^c(X, \\nu, Y) &= \\min_{X} \\max_{\\nu, Y \\succeq 0} \\left( \\langle C,X\\rangle + \\sum_{\\ell=1}^{L} \\nu_\\ell (b_\\ell - \\langle A^\\ell, X\\rangle) - \\langle Y,X\\rangle \\right) \\nonumber \\\\\n", "& = \\min_{X \\succeq 0} \\max_{\\nu} \\left( \\langle C,X\\rangle + \\sum_{\\ell=1}^{L} \\nu_\\ell (b_\\ell - \\langle A^\\ell, X\\rangle) \\right) \\nonumber \\\\\n", "&= \\min_{X} \\left\\{\\begin{aligned}\n", "&\\langle C,X\\rangle && \\text{if } \\langle A^\\ell, X\\rangle = b_\\ell, ~\\ell=1, \\ldots, L \\\\\n", "& -\\infty && otherwise\n", "\\end{aligned} \\right. \\nonumber \\\\\n", "&= \\min_{X} \\langle C, X\\rangle : \\langle A^\\ell, ~X\\rangle = b_\\ell, ~\\ell=1, \\ldots, \\ell \\nonumber \\\\\n", "&= p^\\ast \\nonumber \n", "\\end{align}\n", "\n", "\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Lemma**\n", "\n", "\\begin{align*}\n", "d^\\ast = \\max_{\\nu, Y \\succeq 0} \\min_{X} \\mathcal{L}^c(X, \\nu, Y)\n", "\\end{align*}\n", "\n", "**Proof**\n", "\n", "\\begin{align}\n", "\\min_{\\nu, Y \\succeq 0} \\max_{X} \\mathcal{L}^c(X, \\nu, Y) &= \\min_{\\nu, Y \\succeq 0} \\max_{X} \\left( \\nu^\\intercal b + \\left\\langle C - \\sum_{\\ell=1}^{L} \\nu_\\ell A^\\ell, X \\right\\rangle - \\langle Y,X\\rangle \\right) \\nonumber \\\\\n", "& = \\min_{\\nu} \\nu^\\intercal b + \\min_{\\nu, Y \\succeq 0} \\max_{X} \\langle Z-Y, X \\rangle \\nonumber \\\\\n", "& = \\min_{\\nu} \\nu^\\intercal b + \\left\\{\n", "\\begin{aligned}\n", "&0\t\t\t && \\text{if } Z \\succeq 0, \\\\\n", "&+\\infty && \\text{otherwise.}\n", "\\end{aligned}\n", "\\right. \\nonumber \\\\\n", "&= \\min_{\\lambda, \\nu, Z} \\nu^\\intercal b: Z = C - \\sum_{i=1}^{m} \\nu_i A_i \\succeq 0 \\nonumber\\\\\n", "&= d^\\ast. \\nonumber\n", "\\end{align}\n", "\n", " $\\blacksquare$ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Strong Duality\n", "\n", "Strong duality can fail. Consider the example\n", "\\begin{align}\n", "p^\\ast = \\min_{x} x_2 : \\left( \n", "\\begin{array}{ccc}\n", "x_2 + 1 & 0 & 0 \\\\\n", "0 & x_1 & x_2 \\\\\n", "0 & x_2 & 0\n", "\\end{array}\n", "\\right) \\succeq 0 \\nonumber\n", "\\end{align}\n", "\n", "positive-semi definiteness implies $x_1 \\le 0$ and $x^2_2 \\le 0$. So, $p^\\ast = 0$.\n", "\n", "The dual is\n", "\\begin{align}\n", "d^\\ast = \\max_{Y \\in S^3} - Y_{11}: Y \\succeq 0, Y_{22}=0, 1-Y_{11}- 2Y_{23} = 0 \\nonumber\n", "\\end{align}\n", "\n", "Since $Y \\succeq 0, Y_{22}=0$, $Y_{23}= 0$. Hence, $d^\\ast = - Y_{11} = -1 < p^\\ast$.\n", "\n", "**Strong duality holds** if Slater's condition holds for the primal problem where Slater's condition states that the feasible region must have an interior point." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Conic Programming\n", "\n", "Consider a conic programming problem\n", "\n", "\\begin{align}\n", "p^\\ast = \\min_{x \\in \\Re^n} \\langle c, x \\rangle : Ax = b, ~ x \\in K, \\nonumber\n", "\\end{align}\n", "\n", "where $K \\subseteq \\Re^n$ is a convex cone.\n", "\n", "The dual problem is\n", "\n", "\\begin{align}\n", "\\max_{y \\in \\Re^m} \\langle b, y \\rangle : A^\\intercal y + s = c, ~ s \\in K^\\ast, \\nonumber\n", "\\end{align}\n", "\n", "where $K^\\ast$ is the dual cone defined by\n", "\n", "\\begin{align*}\n", "K^\\ast = \\{s \\in \\Re^n: \\langle s,x\\rangle \\ge 0,~ \\forall x \\in K\\}.\n", "\\end{align*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Lagrange dual problem is\n", "\n", "\\begin{align}\n", "\\inf_{x \\in K} \\mathcal{L}(y) &= \\inf_{x \\in K} \\langle c, x \\rangle + \\langle y, b-Ax \\rangle \\nonumber \\\\\n", "& = \\langle y, b \\rangle + \\inf_{x \\in K} \\langle c, x \\rangle - \\langle y, Ax \\rangle = \\langle y, b \\rangle + \\inf_{x \\in K} \\langle c-A^\\intercal y, x \\rangle \\nonumber \n", "\\end{align}\n", "\n", "- If $c-A^\\intercal y \\in K^\\ast$, then $\\langle c-A^Ty, x \\rangle$, so $\\inf_{x \\in K} \\langle c-A^\\intercal y, x \\rangle = 0$.\n", "\n", "- If $c-A^\\intercal y \\notin K^\\ast$, then $\\inf_{x \\in K} \\langle c-A^\\intercal y, x \\rangle = -\\infty$.\n", "\n", "\n", "Thus\n", "\n", "\\begin{align}\n", "\\inf_{x \\in K} \\mathcal{L}(y) = \\left\\{ \\begin{aligned}\n", "&\\langle y, b \\rangle && c-A^\\intercal y \\in K^\\ast \\\\\n", "&-\\infty && c-A^\\intercal y \\notin K^\\ast\n", "\\end{aligned}\\right. \\nonumber\n", "\\end{align}\n", "\n", "Let $s = c-A^Ty \\in K^\\ast$. Then we obtain the dual problem." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Applications" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Eigenvalue Problem\n", "\n", "For a matrix $A \\in S^n_+$ (the set of positive semi-definite matrices), we consider\n", "\n", "\\begin{align}\n", "p^\\ast = \\max_X \\langle A, X\\rangle: \\textrm{Tr}(X) = 1, ~ X \\succeq 0. \\nonumber\n", "\\end{align}\n", "\n", "The associated Lagrangian, using the conic approach, is\n", "\n", "\\begin{align}\n", "\\mathcal{L}(X, Y, \\nu) = \\langle A, X\\rangle + \\nu (1 - \\textrm{Tr}(X)) + \\langle Y, X\\rangle \\nonumber\n", "\\end{align}\n", "\n", "with the matrix dual variable $Y \\succeq 0$, and $\\nu \\in \\mathbb{R}$ is free. The dual problem is\n", "\n", "\\begin{align}\n", "d^\\ast = \\min_{\\nu, Y} : Y + A = \\lambda I, ~ Y \\succeq 0. \\nonumber\n", "\\end{align}\n", "\n", "Both the primal and dual problems are strictly feasible, so $p^\\ast = d^\\ast$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Robust Linear Programming\n", "\n", "\\begin{align*}\n", "\\min \\ c^\\intercal x: \\text{Prob}(a_i^\\intercal x \\le b_i) \\ge \\eta, \\ i=1,\\ldots, n\n", "\\end{align*}\n", "\n", "where coefficient vectors are IID $a_i \\sim \\mathcal{N}(\\bar{a}_i, \\Sigma_i)$. For fixed $x$, $a^\\intercal_i x$ is $\\mathcal{N}(\\bar{a}^\\intercal_i x, x^\\intercal \\Sigma_i x)$. We have\n", "\n", "\\begin{align*}\n", "\\text{Prob}(\\bar{a}^\\intercal_i x + x^\\intercal \\Sigma_i x Z \\le b) & \\ge \\eta \\Leftrightarrow \\text{Prob}\\left(Z \\le \\frac{b - \\bar{a}^\\intercal_i x}{x^\\intercal \\Sigma_i x }\\right) \\ge \\eta \\\\\n", "\\Phi^{-1}(\\eta) & \\le \\frac{b - \\bar{a}^\\intercal_i x}{x^\\intercal \\Sigma_i x } \\Leftrightarrow \\bar{a}^\\intercal_i x + \\Phi^{-1}(\\eta) \\lVert \\Sigma_i^{1/2} x \\rVert_2 \\le b\n", "\\end{align*}\n", "\n", "it is equivalent to\n", "\n", "\\begin{align*}\n", "\\min \\ & c^\\intercal x \\\\\n", "\\text{s.t. } & \\bar{a}^\\intercal_i x + \\Phi^{-1}(\\eta) \\lVert \\Sigma_i^{1/2} x \\rVert_2 \\le b, \\ i=1,\\ldots, n\n", "\\end{align*}\n", "\n", "\n", "- robust LP is an SOCP for $\\eta \\ge 0.5$ because $\\Phi^{-1}(\\eta) \\ge 0$\n", "\n", "- robust LP is a LP for $\\eta = 0.5$ because $\\Phi^{-1}(\\eta) = 0$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Distributionally Robust Optimization\n", "\n", "\\begin{align*}\n", "\\max_{F} \\ & E_{F}(h(x, \\xi)) \\\\\n", "\\text{s.t.} \\ & \\int_{\\mathcal{S}} d F(\\xi) = 1 \\\\\n", "& (\\mathbb{E}(\\xi) - \\mu)^\\intercal \\Sigma^{-1} (\\mathbb{E}(\\xi) - \\mu) \\le \\gamma_1 \\\\\n", "& \\mathbb{E}((\\xi-\\mu)(\\xi-\\mu)^\\intercal) \\preceq \\gamma_2 \\Sigma\n", "\\end{align*}\n", "\n", "is equivalent to an SDP\n", "\n", "\\begin{align*}\n", "\\max_{F} \\ & \\int_{\\mathcal{S}} h(x, \\xi) d F(\\xi) \\\\\n", "\\text{s.t.} \\ & \\int_{\\mathcal{S}} d F(\\xi) = 1 \\\\\n", "& \\int_{\\mathcal{S}} \\left[\n", "\\begin{array}{cc}\n", "\\Sigma & \\xi-\\mu \\\\\n", "\\xi-\\mu & \\gamma_1\n", "\\end{array}\n", "\\right] d F(\\xi) \\succeq 0 \\quad \\text{(Schur complement)}\\\\\n", "& \\int_{\\mathcal{S}} (\\xi-\\mu)(\\xi-\\mu)^\\intercal d F(\\xi) \\preceq \\gamma_2 \\Sigma\n", "\\end{align*}\n", "\n", "We refer to (Delage and Ye 2010) for its dual formulation." ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "hide_input": false, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": true, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": true }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 2 }