We can define a general optimization problem as follows:
\begin{array}{lll} \min & f(x) & \\ \text{subject to} & g_i(x) \geq 0, & i=1, \dots, m, \end{array}where $f:\Re^n \to \Re$ and $g_i:\Re^n \to \Re$ for $i=1,\ldots,m$. We call $f$ the objective function and the $g_i$ the constraint functions. These constraints define the feasible region \begin{align*} \mathscr{F} = \{x \in \Re^n : g_i(x) \geq 0, i=1, \dots, m\}. \end{align*}
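As a concrete numerical instance of this formulation (the particular $f$ and $g_1$ below are illustrative choices, not from the text), note that SciPy's `minimize` accepts inequality constraints in exactly the $g_i(x) \ge 0$ convention used above:

```python
# Illustrative instance: f(x) = (x1-1)^2 + (x2-2)^2, g1(x) = 1 - x1 - x2 >= 0.
# These are made-up choices to show the formulation; SciPy's "ineq"
# constraints use exactly the g_i(x) >= 0 convention.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
cons = [{"type": "ineq", "fun": lambda x: 1.0 - x[0] - x[1]}]

res = minimize(f, x0=np.zeros(2), constraints=cons)
print(res.x)  # approximately (0, 1): the projection of (1, 2) onto x1 + x2 <= 1
```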
Linear Programming
A linear programming (LP) problem is an optimization problem in which the objective function and the constraints are all linear functions. We can therefore write a linear programming problem using vectors and matrices: \begin{align*} p^* = \max_\mathbf{x} \ & \mathbf{c}^\intercal \mathbf{x} \\ \text{subject to } & A\mathbf{x} = \mathbf{b} \\ & \mathbf{x} \ge 0 \end{align*}
where $A \in \Re^{m \times n}$, $\mathbf{b} \in \Re^{m}$, $\mathbf{c} \in \Re^{n}$ and $\mathbf{x} \in \Re^{n}$. The mathematical model above is also known as a linear program in standard form. Note that in this form, we have only equalities and the nonnegativity constraints. This is not a restriction as it is fairly easy to convert any linear optimization problem into the standard form.
A simple LP example is as follows: \begin{align*} \begin{array}{lrlrlrl} \max & 80x_1 & + & 129x_2 & & & \\ \text{ s.t.} & 5x_1 & + & 6x_2 & & & \le 10000 \\ & x_1 & + & x_2 & & & \le 3000 \\ & x_1 & & & & & \le 600 \\ & & & x_2 & & & \le 1200 \\ & \mathbf{x} & \ge 0 &&& \end{array} \end{align*}
The LP can be rewritten in standard form by introducing slack variables $x_3, \dots, x_6$: \begin{align*} \begin{array}{lrlrlrlrlrlrl} \max & 80x_1 & + & 129x_2 & \\ \text{ s.t.} & 5x_1 & + & 6x_2 & + & x_3 & & & & & & & = 10000 \\ & x_1 & + & x_2 & & & + & x_4 & & & & & = 3000 \\ & x_1 & & & & & & & + & x_5 & & & = 600 \\ & & & x_2 & & & & & & & + & x_6 & = 1200 \\ & \mathbf{x} & \ge 0 &&&&&&& \end{array} \end{align*} Hence, we can assume $A \mathbf{x} = \mathbf{b}$ without loss of generality.
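In code, the conversion is just a matter of appending an identity block for the slack variables. A minimal sketch with NumPy for the example above:

```python
# Building the standard form A x = b for the example above: append an
# identity block for the slack variables x3..x6 and give them zero cost.
import numpy as np

A_ineq = np.array([[5.0, 6.0],
                   [1.0, 1.0],
                   [1.0, 0.0],
                   [0.0, 1.0]])
b = np.array([10000.0, 3000.0, 600.0, 1200.0])

A = np.hstack([A_ineq, np.eye(4)])          # x = (x1, x2, x3, x4, x5, x6)
c = np.array([80.0, 129.0, 0.0, 0.0, 0.0, 0.0])
```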
Duality
Each LP problem has a unique matching LP model known as its dual problem. The primal and dual problems are as follows.
\begin{align*} p^* = \max_\mathbf{x} \ & \mathbf{c}^\intercal \mathbf{x} &&&&& d^* = \min \ & \mathbf{y}^\intercal \mathbf{b} \\ \text{s.t. } & A\mathbf{x} = \mathbf{b} \qquad \text{(Primal)}&&&&& \text{s.t. } & \mathbf{y}^\intercal A \ge \mathbf{c}^\intercal \qquad \text{(Dual)}\\ & \mathbf{x} \ge 0 \end{align*}The duality theory of LP states that if both primal and dual problems are feasible, then $p^* = d^*$. We introduce Farkas' lemma before proving this result.
Farkas' Lemma
For $A \in \Re^{m \times n}$ and $\mathbf{b} \in \Re^{m}$, exactly one of the following two systems holds:
\begin{align*} \text{(I)} \quad \exists \mathbf{x} \in \Re^n & &&&&& \text{(II)} \quad \exists \mathbf{y} \in \Re^m &\\ \text{s.t. } A\mathbf{x} &= \mathbf{b} &&&&& \text{s.t. } A^\intercal \mathbf{y} & \ge 0 \\ \mathbf{x} &\ge 0 &&&&& \mathbf{y}^\intercal \mathbf{b} & < 0 \end{align*}Farkas' lemma has an alternative statement:
For $A \in \Re^{m \times n}$ and $\mathbf{b} \in \Re^{m}$, exactly one of the following two systems holds:
\begin{align*} \text{(I)} \quad \exists \mathbf{x} \in \Re^n & &&&&& \text{(II)} \quad \exists \mathbf{y} \in \Re^m & \\ \text{s.t. } A\mathbf{x} & \le \mathbf{b} &&&&& \text{s.t. } A^\intercal \mathbf{y} & \ge 0\\ \mathbf{x} &\ge 0 &&&&& \mathbf{y}^\intercal \mathbf{b} & < 0 \\ &&&&&& \mathbf{y} & \ge 0 \end{align*}
Proof
Proof of not both
Suppose $\mathbf{x}$ satisfies $\text{(I)}$ and $\mathbf{y}$ satisfies $\text{(II)}$. Then $0 > \mathbf{y}^\intercal \mathbf{b} = \mathbf{y}^\intercal A \mathbf{x} \ge 0$, a contradiction.
Proof of at least one
Suppose $\text{(I)}$ is infeasible. We will show that $\text{(II)}$ is feasible.
- Consider $S = \{ A \mathbf{x} : \mathbf{x} \ge 0 \}$; then $S$ is closed and convex, and $\mathbf{b} \notin S$ since $\text{(I)}$ is infeasible.
- Let $\mathbf{y} \in \Re^m$ and $\alpha \in \Re$ define a hyperplane that separates $\mathbf{b}$ from $S$: $\mathbf{y}^\intercal \mathbf{b} < \alpha$, but $\mathbf{y}^\intercal s \ge \alpha$ $\forall s \in S$.
- $0 \in S \Rightarrow \alpha \le 0 \Rightarrow \mathbf{y}^\intercal \mathbf{b} < 0$.
- $\mathbf{y}^\intercal A \mathbf{x} \ge \alpha$ $\forall \mathbf{x} \ge 0$ implies $\mathbf{y}^\intercal A \ge 0$: if some component of $\mathbf{y}^\intercal A$ were negative, taking the corresponding component of $\mathbf{x}$ arbitrarily large would violate the bound.
Proof of alternative
Apply Farkas' lemma to the equivalent system with slack variables $\mathbf{s}$:
\begin{align*} \text{(I)} \quad \exists \mathbf{x} \in \Re^n,\ \mathbf{s} \in \Re^m & &&&&& \text{(II)} \quad \exists \mathbf{y} \in \Re^m & \\ \text{s.t. } A\mathbf{x} + \mathbf{I} \mathbf{s} & = \mathbf{b} &&&&& \text{s.t. } A^\intercal \mathbf{y} & \ge 0 \\ \mathbf{x},\ \mathbf{s} &\ge 0 &&&&& \mathbf{I} \mathbf{y} &\ge 0 \\ &&&&&& \mathbf{y}^\intercal \mathbf{b} & < 0 \end{align*}
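Numerically, Farkas' lemma can be illustrated with two LPs: try to solve system $\text{(I)}$ as a feasibility problem, and if it is infeasible, search for a certificate $\mathbf{y}$ for system $\text{(II)}$. A sketch (the helper `farkas` and the box constraint on $\mathbf{y}$, which only normalizes the certificate since $\text{(II)}$ is scale-invariant, are our own choices):

```python
# Either find x >= 0 with A x = b, or a certificate y with A^T y >= 0 and
# y^T b < 0. The box -1 <= y <= 1 merely normalizes y: (II) is scale-invariant.
import numpy as np
from scipy.optimize import linprog

def farkas(A, b):
    m, n = A.shape
    # System (I) as a feasibility LP with a zero objective.
    res = linprog(np.zeros(n), A_eq=A, b_eq=b,
                  bounds=[(0, None)] * n, method="highs")
    if res.success:
        return "I", res.x
    # System (II): minimize y^T b subject to A^T y >= 0 (written as -A^T y <= 0).
    res = linprog(b, A_ub=-A.T, b_ub=np.zeros(n),
                  bounds=[(-1, 1)] * m, method="highs")
    return "II", res.x          # res.fun < 0 certifies that (I) is infeasible

# x1 + x2 = -1 has no nonnegative solution, so a certificate y is returned.
print(farkas(np.array([[1.0, 1.0]]), np.array([-1.0])))
```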
Duality Theory
$p^* = d^*$ if both primal and dual problems are feasible.
Weak Duality $p^* \le d^*$
\begin{align*} p^* & = \mathbf{c}^\intercal \mathbf{x} = \mathbf{x}^\intercal \mathbf{c} && \text{since it is a scalar} \\ & \le \mathbf{x}^\intercal \left( A^\intercal \mathbf{y}\right) && \text{because of dual constraint and $\mathbf{x} \ge 0$}\\ & = \left( \mathbf{x}^\intercal A^\intercal \right) \mathbf{y} = \left( A \mathbf{x}\right)^\intercal \mathbf{y} \\ & \le \mathbf{b}^\intercal \mathbf{y} = d^* && \text{because of primal constraint} \end{align*}
Strong Duality $p^* = d^*$
Suppose $p^* < \alpha$; we show $d^* < \alpha$ by applying Farkas' lemma to the following pair of systems.
\begin{align*} \text{(I)} \quad \exists \mathbf{x} \in \Re^n & &&&&& \text{(II)} \quad \exists \mathbf{y} \in \Re^m,\ & z \in \Re\\ \text{s.t. } A\mathbf{x} & = \mathbf{b} &&&&& \text{s.t. } y^\intercal A - z \mathbf{c}^\intercal & \ge 0\\ -\mathbf{c}^\intercal \mathbf{x} + s & = - \alpha &&&&& \mathbf{y}^\intercal \mathbf{b} -z \alpha & < 0\\ s, \mathbf{x} & \ge 0 &&&&& z & \ge 0 \end{align*}where $s, z \in \Re$.
The matrix form of $\text{(I)}$ and $\text{(II)}$:
$\text{(I)}$ | $\text{(II)}$
---|---
\begin{align*}\left( \begin{array}{ll} A & \mathbf{0} \\ -\mathbf{c}^\intercal & 1 \end{array} \right)_{(m+1) \times (n+1)} \left( \begin{array}{l} \mathbf{x} \\ s \end{array} \right) &= \left( \begin{array}{l} \mathbf{b} \\ -\alpha \end{array} \right) \end{align*} | \begin{align*} \left( \mathbf{y}^\intercal, z \right)\left( \begin{array}{ll} A & \mathbf{0} \\ -\mathbf{c}^\intercal & 1 \end{array} \right) & \ge 0 \\ \left( \mathbf{y}^\intercal, z \right)\left( \begin{array}{l} \mathbf{b} \\ -\alpha \end{array} \right) & < 0 \end{align*}
By the definition of $\alpha$, $p^* < \alpha$ implies that $\text{(I)}$ is infeasible, so $\text{(II)}$ is feasible by Farkas' lemma. Let $(\mathbf{y}, z)$ be a solution to $\text{(II)}$. If $z = 0$, then $\mathbf{y}$ satisfies $A^\intercal \mathbf{y} \ge 0$ and $\mathbf{y}^\intercal \mathbf{b} < 0$, so Farkas' lemma implies that $\{\mathbf{x} \in \Re^n: A \mathbf{x} = \mathbf{b}, \mathbf{x} \ge 0\}$ is infeasible, a contradiction since the primal problem is assumed to be feasible.
Hence $z > 0$, and we may divide both $\mathbf{y}$ and $z$ by $z$ so that the scaled pair still satisfies $\text{(II)}$ with $z = 1$.
Then $\mathbf{y}^\intercal A \ge \mathbf{c}^\intercal$, so $\mathbf{y}$ is feasible to the dual problem, and $\mathbf{y}^\intercal \mathbf{b} < \alpha$, i.e., $d^* < \alpha$.
Since we have shown that $p^* < \alpha \Rightarrow d^* < \alpha$ for every $\alpha$, we conclude $d^* \le p^*$. Together with weak duality, we have $p^* = d^*$.
Strong duality gives $p^* = d^*$ in general, even when the optimal values are infinite. The table below summarizes strong duality in its diagonal entries; the blank cells are impossible scenarios for the primal and dual problems.
 | $p^* = -\infty \text{ (primal inf.)}$ | $p^* \text{ is finite}$ | $p^* = +\infty \text{ (primal unb.)}$
---|---|---|---
$d^* = -\infty \text{ (dual unb.)}$ | $\text{primal inf. and dual unb.}$ |  |
$d^* \text{ is finite}$ |  | $p^* = d^*$ |
$d^* = +\infty \text{ (dual inf.)}$ | $\text{primal and dual inf.}$ |  | $\text{primal unb. and dual inf.}$
The only exception is the bottom-left corner, where both the primal and the dual are infeasible. Here is an example: \begin{align*} p^* = \max_x \ & x & d^* = \min_y \ & y \cdot 1 \\ \text{s.t. } & 0 \cdot x = 1 & \text{s.t. } & y \cdot 0 \ge 1 \\ & x \ge 0 \end{align*}
Simplex Algorithm
The optimal solution of an LP can be obtained by the simplex method. We use the previous example to demonstrate the algorithm.
Step 1: \begin{align*} \begin{array}{lrlrlrlrlrlrl} \max & 80x_1 & + & 129x_2 & \\ \text{ s.t.} & 5x_1 & + & 6x_2 & + & x_3 & & & & & & & = 10000 \\ & x_1 & + & x_2 & & & + & x_4 & & & & & = 3000 \\ & x_1 & & & & & & & + & x_5 & & & = 600 \\ & & & x_2 & & & & & & & + & x_6 & = 1200 \\ & \mathbf{x} & \ge 0 &&&&&&& \\ \text{solution} & 0 & & 0 & &10000& & 3000& & 600 & & 1200& \\ \end{array} \end{align*}
- The solution is feasible, but certainly not optimal
- Increasing the value of either $x_1$ or $x_2$ gives a better objective value
Step 2: \begin{align*} \begin{array}{lrlrlrlrlrlrl} \max & 80x_1 & + & 129x_2 & \\ \text{ s.t.} & 5x_1 & & & + & x_3 & & & & & - &6x_6 & = 2800 \\ & x_1 & & & & & + & x_4 & & & - & x_6 & = 1800 \\ & x_1 & & & & & & & + & x_5 & & & = 600 \\ & & & x_2 & & & & & & & + & x_6 & = 1200 \\ & \mathbf{x} & \ge 0 &&&&&&& \\ \text{solution} & 0 & & 1200 & &2800 & & 1800& & 600 & & 0 & \\ \end{array} \end{align*} Objective: $129\times 1200 - 129x_6 + 80x_1$.
Step 3: \begin{align*} \begin{array}{lrlrlrlrlrlrl} \max & 80x_1 & + & 129x_2 & \\ \text{ s.t.} & x_1 & & & + &1/5x_3 & & & & & - &6/5x_6 & = 560 \\ & & & & &-1/5x_3& + & x_4 & & & + &1/5x_6 & = 1240 \\ & & & & &-1/5x_3& & & + & x_5 & + &6/5x_6 & = 40 \\ & & & x_2 & & & & & & & + & x_6 & = 1200 \\ & \mathbf{x} & \ge 0 &&&&&&& \\ \text{solution} & 560 & & 1200 & &0 & & 1240& & 40 & & 0 & \\ \end{array} \end{align*} Objective: $129\times 1200 + 80\times 560 - 16x_3 - 33x_6 = 199,600 - 16x_3 - 33x_6$. Since the nonbasic variables $x_3$ and $x_6$ now have negative objective coefficients, increasing either can only decrease the objective; the solution $(x_1, x_2) = (560, 1200)$ is optimal with value $199,600$.
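As a sanity check, the same LP can be handed to an off-the-shelf solver. A sketch using SciPy's HiGHS backend (not part of the original notes); the reported marginals are the dual prices, with signs following the internal minimization form:

```python
# Solve max 80*x1 + 129*x2 subject to the four inequality constraints.
# linprog minimizes, so we negate the objective.
import numpy as np
from scipy.optimize import linprog

c = np.array([80.0, 129.0])
A_ub = np.array([[5.0, 6.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
b_ub = np.array([10000.0, 3000.0, 600.0, 1200.0])

res = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2, method="highs")
print(-res.fun, res.x)           # 199600.0 at x = (560, 1200)
print(res.ineqlin.marginals)     # (-16, 0, 0, -33): dual prices in min-form signs
```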
Consider the LP \begin{align*} p^* = \max_{\mathbf{x} \ge 0} \ & \mathbf{c}^\intercal \mathbf{x} \\ \text{s.t. } & A\mathbf{x} = \mathbf{b} \end{align*}
Let $B, N$ be a partition of the set $\{1,\ldots,n\}$. We refer to $B$ as the basis and $N$ as the non-basis. The simplex algorithm searches for an optimal basis such that $x_i = 0$ for all $i \in N$. For both primal and dual problems: \begin{align*} p^* = \max_{\mathbf{x}} \ & \mathbf{c}^\intercal \mathbf{x} & d^* = \min \ & \mathbf{y}^\intercal \mathbf{b} \\ \text{s.t. } & A\mathbf{x} = \mathbf{b} & \text{s.t. } & \mathbf{y}^\intercal A \ge \mathbf{c}^\intercal \\ & \mathbf{x} \ge 0 \end{align*}
we conduct the basis and non-basis decomposition \begin{align*} A \mathbf{x} &= (A_B, A_N) \left( \begin{array}{c} \mathbf{x}_B \\ \mathbf{x}_N \end{array} \right) = A_B \mathbf{x}_B + A_N \mathbf{x}_N = A_B ( \mathbf{x}_B + A^{-1}_B A_N \mathbf{x}_N ) = \mathbf{b} \\ \Rightarrow x_B &= A^{-1}_B \mathbf{b} - A^{-1}_B A_N \mathbf{x}_N \\ \mathbf{c}^\intercal \mathbf{x} &= (\mathbf{c}_B, \mathbf{c}_N)^\intercal \left( \begin{array}{c} \mathbf{x}_B \\ \mathbf{x}_N \end{array} \right) = \mathbf{c}^\intercal_B \mathbf{x}_B + \mathbf{c}^\intercal_N \mathbf{x}_N = \mathbf{c}^\intercal_B ( A^{-1}_B \mathbf{b} - A^{-1}_B A_N \mathbf{x}_N ) + \mathbf{c}^\intercal_N \mathbf{x}_N \\ \mathbf{y}^\intercal A &= \mathbf{y}^\intercal (A_B, A_N) \ge (\mathbf{c}_B, \mathbf{c}_N)^\intercal \Rightarrow \mathbf{y}^\intercal A_B \ge \mathbf{c}_B^\intercal \text{ and } \mathbf{y}^\intercal A_N \ge \mathbf{c}_N^\intercal \end{align*} Hence \begin{align*} p^* = \mathbf{c}^\intercal_B A^{-1}_B \mathbf{b} + \max \ & (\mathbf{c}^\intercal_{N} - \mathbf{c}^\intercal_B A^{-1}_B A_{N}) \mathbf{x}_N & d^* = \min \ & \mathbf{y}^\intercal \mathbf{b} \\ \text{s.t. } & \mathbf{x}_B + A^{-1}_B A_N \mathbf{x}_N = A^{-1}_B \mathbf{b} & \text{s.t. } & \mathbf{y}^\intercal A_B \ge \mathbf{c}_B^\intercal\\ & \mathbf{x}_B, \mathbf{x}_N \ge 0 & & \mathbf{y}^\intercal A_N \ge \mathbf{c}_N^\intercal \end{align*}
For the optimal basis, we have
- primal solution $(\mathbf{x}_B, \mathbf{x}_N) = (A^{-1}_B \mathbf{b}, 0)$ and
- dual solution $\mathbf{y}^\intercal = \mathbf{c}^\intercal_B A^{-1}_B$
The basis $A_B$ is
- primal feasible if $A^{-1}_B \mathbf{b} \ge 0$ and
- dual feasible if $\mathbf{c}^\intercal_B A^{-1}_B A_{N} \ge \mathbf{c}^\intercal_{N}$.
If $A_B$ is both primal and dual feasible, the primal and dual objective values both equal $\mathbf{c}^\intercal_B A^{-1}_B \mathbf{b}$; by weak duality, we therefore have an optimal solution. These conditions are easy to test numerically, as sketched below.
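A sketch that checks a given basis on the standard-form data of the running example (the helper `check_basis` is ours, not a library routine):

```python
# Check primal and dual feasibility of a basis using the formulas above,
# on the standard-form data of the running example.
import numpy as np

A = np.array([[5.0, 6.0, 1.0, 0.0, 0.0, 0.0],
              [1.0, 1.0, 0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]])
b = np.array([10000.0, 3000.0, 600.0, 1200.0])
c = np.array([80.0, 129.0, 0.0, 0.0, 0.0, 0.0])

def check_basis(B):
    """B lists the m basic column indices."""
    N = [j for j in range(A.shape[1]) if j not in B]
    xB = np.linalg.solve(A[:, B], b)              # x_B = A_B^{-1} b
    y = np.linalg.solve(A[:, B].T, c[B])          # y^T = c_B^T A_B^{-1}
    primal_ok = bool(np.all(xB >= -1e-9))
    dual_ok = bool(np.all(y @ A[:, N] >= c[N] - 1e-9))
    return xB, y, primal_ok, dual_ok

# Basis {x1, x2, x4, x5} from Step 3: both checks pass, hence optimal.
xB, y, p_ok, d_ok = check_basis([0, 1, 3, 4])
print(xB, y, p_ok, d_ok)          # y = (16, 0, 0, 33), objective 199600
```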
We summarize the primal and dual simplex algorithms (Phase 2) in the following table, assuming a feasible basis is available (Phase 1).
Algorithm: | primal simplex | dual simplex
---|---|---
Initial: | A primal feasible basis $A_B$ | A dual feasible basis $A_B$
Optimality: | if $A_B$ is dual feasible | if $A_B$ is primal feasible
Pricing: | select $r \in N$ with $\bar{c}_r > 0$ | select $s \in B$ with $\bar{b}_s < 0$
 | unbounded if $\bar{a}_r \le 0$ | infeasible if $\bar{a}_{sj} \geq 0$ $\forall j \in N$
where the LP basis formulation is
\begin{align*} p^* = \mathbf{c}^\intercal_B A^{-1}_B \mathbf{b} + \max & \quad \underbrace{(\mathbf{c}^\intercal_{N} - \mathbf{c}^\intercal_B A^{-1}_B A_{N})}_{\bar{\mathbf{c}}^\intercal} \mathbf{x}_N \\ \text{s.t.} & \quad \mathbf{x}_B + \underbrace{A^{-1}_B A_N}_{\bar{a}} \mathbf{x}_N = A^{-1}_B \mathbf{b} = \bar{\mathbf{b}} \\ & \quad \mathbf{x}_B, \mathbf{x}_N \ge 0 \end{align*}
Integer Programming
Optimization models in which some or all of the variables must be integer are known as mixed-integer programming (MIP) models. There are two main reasons for using MIP.
The decision variables are quantities that have to be integer, e.g., the number of employees to assign or the number of cars to make.
The decision variables are binary and indicate whether an activity is undertaken, e.g., $z \in \{0, 1\}$, where $z = 1$ indicates the decision to rent an equipment so that the firm needs to pay a fixed cost, and $z = 0$ otherwise.
If we relax the integrality requirements of an MIP, we obtain the so-called linear programming (LP) relaxation of the given MIP.
You might suspect that MIP models would be easier to solve than LP models. After all, there are only a finite number of feasible integer solutions in an MIP model, whereas there are infinitely many feasible solutions in an LP model.
However, exactly the opposite is true: MIP models are considerably harder to solve than LP models, because the optimal solution rarely occurs at a corner of the LP relaxation's feasible region. The "integer feasible region" in the picture connects all integer feasible solutions.
The feasible region of a typical MIP model and its LP relaxation are shown in the picture.
Similar to LP, the optimal solution of the MIP occurs at a corner of the integer feasible region. However, as shown in the picture, the boundary of the integer feasible region is much more complicated than that of the LP relaxation's feasible region.
Algorithms such as branch-and-bound (B&B) are designed to solve MIPs effectively and are well implemented in commercial software. We use the following example to demonstrate the B&B algorithm. Consider an integer program: \begin{align*} z_{IP} = \max \quad 7x_1 + 2x_2 &\\ -x_1 + 2x_2 & \le 4 \\ 5x_1 + x_2 & \le 20 \\ -2x_1 - 2x_2 & \le -7 \\ \mathbf{x} & \in \mathbb{Z}^2_+ \end{align*}
Step 1: We ignore the integrality constraints $\mathbf{x} \in \mathbb{Z}^2_+$ and solve the problem as an LP; each block below is one simplex dictionary, so the LP is solved after two pivots. \begin{align*} \begin{array}{rrrrrrr} x_3 & = & 4 & + & x_1 & - & 2x_2 \\ x_4 & = & 20 & - & 5x_1 & - & x_2 \\ x_5 & = & -7 & + & 2x_1 & + & 2x_2 \\ z & = & & & 7x_1 & + & 2x_2 \\ \\ x_3 & = & 8 & - & \frac{11}{5} x_2 & - &\frac{1}{5} x_4 \\ x_1 & = & 4 & - & \frac{1}{5}x_2 & - & \frac{1}{5} x_4 \\ x_5 & = & 1 & + & \frac{8}{5}x_2 & - & \frac{2}{5}x_4 \\ z & = & 28 & + & \frac{3}{5}x_2 & - & \frac{7}{5}x_4 \\ \\ x_2 & = & \frac{40}{11} & - & \frac{5}{11} x_3 & - &\frac{1}{11} x_4 \\ x_1 & = & \frac{36}{11} & + & \frac{1}{11} x_3 & - & \frac{2}{11} x_4 \\ x_5 & = & \frac{75}{11} & - & \frac{8}{11}x_3 & - & \frac{6}{11}x_4 \\ z & = & 30\frac{2}{11} & - & \frac{3}{11}x_3 & - & \frac{16}{11}x_4 \\ \end{array} \end{align*}
Step 2: Although the current solution is optimal for the LP relaxation, it is not feasible for the integer program. We branch on $x_2$: since $x_2 = 40/11$ must be an integer, we consider the two branches $x_2 \ge 4$ and $x_2 \le 3$. For the first branch $x_2 \ge 4$, i.e., $x_2 - t = 4$, $t \ge 0$, we have \begin{align*} \begin{array}{rrrrrrr} x_2 & = & \frac{40}{11} & - & \frac{5}{11} x_3 & - &\frac{1}{11} x_4 \\ x_1 & = & \frac{36}{11} & + & \frac{1}{11} x_3 & - & \frac{2}{11} x_4 \\ x_5 & = & \frac{75}{11} & - & \frac{8}{11}x_3 & - & \frac{6}{11}x_4 \\ \color{red}{t} & \color{red}{= } & \color{red}{-\frac{4}{11}} & \color{red}{-} & \color{red}{\frac{5}{11}x_3} & \color{red}{-} & \color{red}{\frac{1}{11}x_4} \\ z & = & 30\frac{2}{11} & - & \frac{3}{11}x_3 & - & \frac{16}{11}x_4 \\ \end{array} \end{align*}
This branch is infeasible: the $t$ row gives $t = -\frac{4}{11} - \frac{5}{11}x_3 - \frac{1}{11}x_4 < 0$ for all $x_3, x_4 \ge 0$, while $t \ge 0$ is required. Hence, the node is fathomed in the branch-and-bound tree as shown in the figure below.
For the branch $x_2 \le 3$, i.e., $x_2 + s = 3$, $s \ge 0$, we have \begin{align*} \begin{array}{rrrrrrr} x_2 & = & \frac{40}{11} & - & \frac{5}{11} x_3 & - &\frac{1}{11} x_4 \\ x_1 & = & \frac{36}{11} & + & \frac{1}{11} x_3 & - & \frac{2}{11} x_4 \\ x_5 & = & \frac{75}{11} & - & \frac{8}{11}x_3 & - & \frac{6}{11}x_4 \\ \color{red}{ s } & \color{red}{= } & \color{red}{-\frac{7}{11}} & \color{red}{+} & \color{red}{\frac{5}{11}x_3} & \color{red}{+} & \color{red}{\frac{1}{11}x_4} \\ z & = & 30\frac{2}{11} & - & \frac{3}{11}x_3 & - & \frac{16}{11}x_4 \\ \\ x_2 & = & 3 & & & - & s \\ x_1 & = & \frac{17}{5} & - & \frac{1}{5} x_4 & + & \frac{1}{5} s \\ x_5 & = & \frac{29}{5} & - & \frac{2}{5}x_4 & - & \frac{8}{5} s \\ x_3 & = & \frac{7}{5} & - & \frac{1}{5}x_4 & + & \frac{11}{5} s \\ z & = & \frac{149}{5} & - & \frac{7}{5}x_4 & - & \frac{3}{5} s \end{array} \end{align*}
Again, the solution is LP optimal but not integer feasible, so we branch on $x_1$. As shown in the figure, both branches $x_1 \le 3$ and $x_1 \ge 4$ yield integer feasible solutions; the optimal one is the solution with the larger objective value. The whole procedure is sketched in code below.
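Here is a minimal branch-and-bound sketch for this example, using SciPy's `linprog` for each LP relaxation; the variable-selection rule (first fractional index) and the tolerance are illustrative choices, so the tree may differ from the hand computation:

```python
# Minimal branch-and-bound for the example IP above. Each node solves the LP
# relaxation with linprog (which minimizes, hence the negated objective) and
# branches on the first fractional variable.
import math
import numpy as np
from scipy.optimize import linprog

c = np.array([7.0, 2.0])
A_ub = np.array([[-1.0, 2.0], [5.0, 1.0], [-2.0, -2.0]])
b_ub = np.array([4.0, 20.0, -7.0])

best = {"val": -math.inf, "x": None}

def bb(bounds):
    res = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:                      # infeasible node: fathom
        return
    z = -res.fun
    if z <= best["val"]:                     # LP bound cannot beat incumbent
        return
    frac = [i for i, v in enumerate(res.x) if abs(v - round(v)) > 1e-6]
    if not frac:                             # integer feasible: new incumbent
        best["val"], best["x"] = z, np.round(res.x)
        return
    i, v = frac[0], res.x[frac[0]]
    lo, hi = bounds[i]
    left, right = list(bounds), list(bounds)
    left[i] = (lo, math.floor(v))            # branch x_i <= floor(v)
    right[i] = (math.ceil(v), hi)            # branch x_i >= ceil(v)
    bb(left)
    bb(right)

bb([(0, None), (0, None)])
print(best)   # on this instance: z_IP = 28 at x = (4, 0)
```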
Lagrangian Relaxation
In Lagrangian relaxation, we obtain a relaxed problem by removing the difficult constraints from the problem and adding them to the objective function with some multipliers. The relaxation allows us to solve a series of simpler problems to obtain better and better approximations. Consider an optimization problem:
\begin{align*} \max \ & f(\mathbf{x}) \\ \text{s.t.} \ & c_i(\mathbf{x}) = 0 &&\forall i \in \mathcal{E}\\ & c_i(\mathbf{x}) \le 0 &&\forall i \in \mathcal{I} \end{align*}Let $\pmb{\lambda}$ be the Lagrange multipliers associated with the equality and inequality constraints. The Lagrangian function is
\begin{align*} \mathcal{L}(\mathbf{x}, \pmb{\lambda}) = f(\mathbf{x}) - \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i c_i(\mathbf{x}) \end{align*}and the Lagrangian dual objective function is
\begin{align*} \mathcal{Z}(\pmb{\lambda}) = \sup_{\mathbf{x} \in \Re^n} \mathcal{L}(\mathbf{x}, \pmb{\lambda}) \end{align*}For any feasible $\bar{\mathbf{x}}$ and $\pmb{\lambda}$ with $\lambda_i \ge 0$ $\forall i \in \mathcal{I}$, we have
\begin{align*} \mathcal{Z}(\pmb{\lambda}) = \sup_{\mathbf{x} \in \Re^n} \left\{f(\mathbf{x}) - \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i c_i(\mathbf{x})\right\} \geq f(\bar{\mathbf{x}}) - \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i c_i(\bar{\mathbf{x}}) \geq f(\bar{\mathbf{x}}). \end{align*}The Lagrangian dual problem is \begin{align*} \min \ & \mathcal{Z}(\pmb{\lambda}) \\ \text{s.t.} \ & \lambda_i \ge 0 && \forall i \in \mathcal{I} \end{align*}
which gives the weak duality immediately as follows: \begin{align*} \min \left\{ \mathcal{Z}(\pmb{\lambda}): \lambda_i \ge 0 \ \forall i \in \mathcal{I}\right\} \ge \max \left\{ f(\mathbf{x}): c_i(\mathbf{x}) = 0 \ \forall i \in \mathcal{E}, \ c_i(\mathbf{x}) \le 0 \ \forall i \in \mathcal{I} \right\} \end{align*}
Linear Programming
We apply the Lagrangian relaxation to linear programming to derive the dual formulation and weak duality. Consider the LP
\begin{align*} p^* = \max_\mathbf{x} \ & \mathbf{c}^\intercal \mathbf{x} \\ \text{subject to } & A\mathbf{x} = \mathbf{b} \\ & \mathbf{x} \ge 0 \end{align*}Let $\mathbf{x}^*$ be an optimal solution for the primal problem. Then, for any $\mathbf{y} \in \Re^m$, \begin{align*} p^* = \left( \mathbf{c}^\intercal \mathbf{x}^* - \mathbf{y}^\intercal (A\mathbf{x}^*-\mathbf{b}) \right) \end{align*}
because $A\mathbf{x}^* = \mathbf{b}$. Hence, for any $\mathbf{y} \in \Re^m$, we have \begin{align*} p^* = \left( \mathbf{c}^\intercal \mathbf{x}^* - \mathbf{y}^\intercal (A\mathbf{x}^*-\mathbf{b}) \right) & \le \max_{\mathbf{x} \ge \mathbf{0}} \left( \mathbf{c}^\intercal \mathbf{x} - \mathbf{y}^\intercal (A \mathbf{x}-\mathbf{b}) \right) \equiv \max_{\mathbf{x} \ge \mathbf{0}} \mathcal{L}(\mathbf{x}, \mathbf{y}) \\ \text{Then} \quad p^* & \le \min_{\mathbf{y}} \max_{\mathbf{x} \ge \mathbf{0}} \mathcal{L}(\mathbf{x}, \mathbf{y}) = \min_{\mathbf{y}} \max_{\mathbf{x} \ge \mathbf{0}} \left( (\mathbf{c}^\intercal - \mathbf{y}^\intercal A) \mathbf{x} + \mathbf{y}^\intercal \mathbf{b} \right) \end{align*}Note that $(\mathbf{c}^\intercal - \mathbf{y}^\intercal A)$ is a vector, and \begin{align*} (\mathbf{c}^\intercal - \mathbf{y}^\intercal A)\mathbf{x} = \sum_i (\mathbf{c}^\intercal - \mathbf{y}^\intercal A)_i x_i. \end{align*}
For any given $\mathbf{y}$
if $(\mathbf{c}^\intercal - \mathbf{y}^\intercal A)_i > 0$, then the inner problem is unbounded (let $x_i \to \infty$);
if $(\mathbf{c}^\intercal - \mathbf{y}^\intercal A)_i < 0$, then the inner problem has $x^*_i = 0$;
if $(\mathbf{c}^\intercal - \mathbf{y}^\intercal A)_i = 0$, then $x^*_i$ can take an arbitrary value; this is the complementarity condition.
Hence, continuing the previous inequality, we have
\begin{align*} p^* \le & \min_{\mathbf{y}} \max_{\mathbf{x} \ge \mathbf{0}} \left( (\mathbf{c}^\intercal - \mathbf{y}^\intercal A) \mathbf{x} + \mathbf{y}^\intercal \mathbf{b} \right) \\ = &\min_{\mathbf{y}} \begin{cases} \mathbf{y}^\intercal \mathbf{b} & \mathbf{c}^\intercal - \mathbf{y}^\intercal A \le \mathbf{0} \text{ with } \mathbf{x}^* = \mathbf{0}\\ \infty & \text{Otherwise} \end{cases} \\ = & \min_{\mathbf{y}} \ \mathbf{y}^\intercal \mathbf{b} \\ &\text{ s.t. } \mathbf{y}^\intercal A \ge \mathbf{c}^\intercal \end{align*}
Integer Programming
As one of the most important applications of Lagrangian relaxation, we show how it can provide a bound for the overall integer problem. Consider a pure integer program:
\begin{align*} &\text{maximize} & \mathbf{c}^\intercal \mathbf{x} \\ &\text{subject to} &A^1 \mathbf{x} &\le \mathbf{b}^1 && \text{complicating constraints} \\ &&A^2 \mathbf{x} &\le \mathbf{b}^2 && \text{nice constraints} \\ && \mathbf{x} & \in \mathbb{Z}^n_+ && \end{align*}Let $\mathcal{Q} = \{\mathbf{x} \in \mathbb{Z}^n_+: A^2 \mathbf{x} \le \mathbf{b}^2 \} \neq \emptyset$. The (IP) problem can be rewritten as $z_{IP} = \max \{ \mathbf{c}^\intercal \mathbf{x}: A^1\mathbf{x} \le \mathbf{b}^1, \mathbf{x} \in \mathcal{Q} \}$. The Lagrangian relaxation is
\begin{align*} z_{LR}(\pmb{\lambda}) = \max \{ z(\pmb{\lambda},\mathbf{x}) = \mathbf{c}^\intercal\mathbf{x} + \pmb{\lambda}^\intercal(b^1 - A^1 \mathbf{x}): \mathbf{x} \in \mathcal{Q}\} \end{align*}and the Lagrangian dual is $z_{LD} = \min_{\pmb{\lambda} \ge 0} z_{LR}(\pmb{\lambda})$.
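In practice, $z_{LD}$ is often approximated with a projected subgradient method: at the current $\pmb{\lambda}$, solve the relaxation to get a maximizer $\mathbf{x}_{\pmb{\lambda}}$, use $\mathbf{b}^1 - A^1 \mathbf{x}_{\pmb{\lambda}}$ as a subgradient, and take a diminishing step. A sketch (the helper `lagrangian_dual` is ours, under the simplifying assumption that $\mathcal{Q}$ is small enough to enumerate):

```python
# Projected subgradient sketch for z_LD = min_{lambda >= 0} z_LR(lambda).
# Assumes Q is given as an explicit (small) list of feasible integer points,
# so z_LR can be evaluated exactly; b1 - A1 @ x is then a subgradient.
import numpy as np

def lagrangian_dual(c, A1, b1, Q, iters=200):
    lam = np.zeros(len(b1))
    best = np.inf
    for k in range(1, iters + 1):
        # Inner problem: maximize c^T x + lam^T (b1 - A1 x) over x in Q.
        z, x = max(((c @ x + lam @ (b1 - A1 @ x), x) for x in Q),
                   key=lambda t: t[0])
        best = min(best, z)                   # every z_LR(lam) bounds z_IP above
        g = b1 - A1 @ x                       # subgradient of z_LR at lam
        lam = np.maximum(lam - g / k, 0.0)    # minimize: step against g, project
    return best, lam
```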
\begin{align*} z_{LR}(\pmb{\lambda}) =& \max_{\mathbf{x} \in \mathcal{Q}} (\mathbf{c}^\intercal-\pmb{\lambda}^\intercal A^1) \mathbf{x} + \pmb{\lambda}^\intercal \mathbf{b}^1 \\ =& \max_{\mathbf{x} \in \operatorname{conv}(\mathcal{Q})} (\mathbf{c}^\intercal-\pmb{\lambda}^\intercal A^1) \mathbf{x} + \pmb{\lambda}^\intercal \mathbf{b}^1 \\ =& \max_{\mathbf{x} \in \operatorname{conv}(\mathcal{Q})} \mathbf{c}^\intercal\mathbf{x} + \pmb{\lambda}^\intercal (\mathbf{b}^1 - A^1\mathbf{x}) \\ & \\ z_{LD} =& \min_{\pmb{\lambda} \ge 0} z_{LR}(\pmb{\lambda}) \\ =& \min_{\pmb{\lambda} \ge 0} \max_{\mathbf{x} \in \operatorname{conv}(\mathcal{Q})} \mathbf{c}^\intercal\mathbf{x} + \pmb{\lambda}^\intercal (\mathbf{b}^1 - A^1\mathbf{x}) \end{align*}Here the second equality holds because the objective is linear, so optimizing over $\mathcal{Q}$ and over its convex hull give the same value. We give a numerical example as follows.
\begin{align*} & \begin{array}{rrrrrr} \max & x_1 & + & x_2 & & \\ & -x_1 & + & 2x_2 & \le & 4 \end{array} \\ &\quad \left. \begin{array}{rrrrrr} & -2x_1 & + & 2x_2 & \le & -7 \\ & -x_1 & & & \le & -2 \\ & x_1 & , & x_2 &\in & \mathbb{Z}_+ \end{array} \right \rbrace \mathcal{Q} = \{\mathbf{x} \in \mathbb{Z}^2_+: A^2\mathbf{x} \le \mathbf{b}^2\} \end{align*}
Nonlinear Programming
In many complex optimization problems, the objective and/or the constraints are nonlinear functions of the decision variables. Such optimization problems are called nonlinear programming (NLP) problems.
\begin{align*} \min \ & f(x) \\ \text{s.t.}\ & c_i(x) = 0 &&\forall i \in \mathcal{E}\\ & c_i(x) \ge 0 &&\forall i \in \mathcal{I} \end{align*}Under suitable constraint qualifications, the KKT conditions below are necessary for a local optimum; for convex problems they are also sufficient: \begin{align*} \nabla_x \mathcal{L}(x^*,\lambda^*) = \nabla f(x^*) - \sum_{i} \lambda^*_i \nabla c_i(x^*) & = 0 \\ c_i(x^*) & = 0 && i \in \mathcal{E} \\ c_i(x^*) & \ge 0 && i \in \mathcal{I} \\ \lambda_i^* & \ge 0 && i \in \mathcal{I} \\ \lambda_i^*c_i(x^*) & = 0 && i \in \mathcal{I} \\ \end{align*}
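As a quick numerical illustration, we can verify the KKT conditions at the solution of the small example from the beginning of these notes, $\min (x_1-1)^2 + (x_2-2)^2$ subject to $c(x) = 1 - x_1 - x_2 \ge 0$, whose optimum is $x^* = (0, 1)$ with multiplier $\lambda^* = 2$ (an illustrative instance, not from the text):

```python
# Check stationarity, feasibility, and complementarity at x* = (0, 1).
import numpy as np

x = np.array([0.0, 1.0])
lam = 2.0                                     # multiplier candidate

grad_f = 2.0 * (x - np.array([1.0, 2.0]))     # gradient of the objective
grad_c = np.array([-1.0, -1.0])               # gradient of c(x) = 1 - x1 - x2
c = 1.0 - x.sum()

print(np.allclose(grad_f - lam * grad_c, 0))     # stationarity: grad f = lam * grad c
print(c >= 0, lam >= 0, np.isclose(lam * c, 0))  # feasibility and complementarity
```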
The strong duality of LP can be easily derived by using KKT conditions. Consider the LP \begin{align} p^* = \max_{\mathbf{x}} \ & \mathbf{c}^\intercal \mathbf{x} \nonumber \\ \text{s.t. } & A \mathbf{x} = \mathbf{b} \tag{Primal} \\ & \mathbf{x} \ge \mathbf{0} \nonumber \end{align}
The Lagrangian function \begin{align*} & \mathcal{L}(\mathbf{x}, \mathbf{y}, \mathbf{s}) = - \mathbf{c}^\intercal \mathbf{x} - \mathbf{y}^\intercal (\mathbf{b} - A\mathbf{x}) - \mathbf{s}^\intercal \mathbf{x}\\ \Rightarrow & \nabla_x \mathcal{L}(\mathbf{x}, \mathbf{y}, \mathbf{s}) = -\mathbf{c} + A^\intercal \mathbf{y} - \mathbf{s} = 0 \end{align*}
Hence, the KKT conditions give \begin{align*} \begin{aligned} A^\intercal \mathbf{y} - \mathbf{s} & = \mathbf{c} \\ A\mathbf{x} & = b \\ \mathbf{x} & \ge 0 \\ \mathbf{s} & \ge 0 \\ x_is_i & = 0 \quad \forall i = 1,\ldots,n \end{aligned} \Rightarrow \mathbf{c}^\intercal \mathbf{x}^* = (A^\intercal \mathbf{y}^* - \mathbf{s}^*)^\intercal \mathbf{x}^* = \mathbf{y}^{*\intercal} (A\mathbf{x}^*) = \mathbf{y}^{*\intercal}\mathbf{b} \end{align*}
A solution $(\mathbf{x},\mathbf{y},\mathbf{s}) \in \Re^n \times \Re^m \times \Re^n$ is optimal if it satisfies $A\mathbf{x} = \mathbf{b},\ \mathbf{y}^\intercal A - \mathbf{s}^\intercal = \mathbf{c}^\intercal$ and
\begin{align} \mathbf{s}^\intercal \mathbf{x} = 0 \label{eqn_lp_1}\\ \mathbf{x} \ge 0 \label{eqn_lp_2}\\ \mathbf{s} \ge 0 \label{eqn_lp_3} \end{align}Solutions along the search path of all algorithms satisfy all (in)equalities, except
primal simplex: \eqref{eqn_lp_3} is relaxed initially, and an optimum is found once it is satisfied.
dual simplex: \eqref{eqn_lp_2} is relaxed initially, and an optimum is found once it is satisfied.
primal-dual (interior point): \eqref{eqn_lp_1} is relaxed initially, and an optimum is found once it is satisfied.
Suppose we have a subroutine which can find a solution of a set of linear equations. Design an algorithm to solve the optimization problem $\max \{\mathbf{c}^\intercal \mathbf{x} : A\mathbf{x} = \mathbf{b},\ \mathbf{x} \ge 0\}$.
We can solve the system
\begin{align*} \mathbf{c}^\intercal \mathbf{x} &= \mathbf{b}^\intercal \mathbf{y} \\ A\mathbf{x} &= \mathbf{b} \\ \mathbf{y}^\intercal A - \mathbf{s}^\intercal &= \mathbf{c}^\intercal \\ \mathbf{x} \ge 0, \quad \mathbf{s} &\ge 0 \end{align*}where the first equation enforces strong duality $\mathbf{c}^\intercal \mathbf{x} = \mathbf{b}^\intercal \mathbf{y}$ in place of the complementarity conditions.
Interior point method
\begin{eqnarray*} A^T y - s & = & c \\ A x & = & b \\ x_i s_i & = & 0, i=1,\ldots, n\\ (x,s) & > & 0 \end{eqnarray*}\begin{align*} F(x,y,s) = \left[ \begin{array}{c} A^T y - s - c \\ Ax - b \\ XSe \end{array} \right] = 0, (x,s) > 0 \end{align*}Newton's method implies
\begin{align*} J(x,y,s) \left[\begin{array}{c} \Delta x \\ \Delta y \\ \Delta s \end{array} \right] = \left[\begin{array}{ccc} 0 & A^T & -I \\ A & 0 & 0 \\ S & 0 & X \end{array} \right] \left[\begin{array}{c} \Delta x \\ \Delta y \\ \Delta s \end{array} \right]= -F(x,y,s) = \left[ \begin{array}{c} 0 \\ 0 \\ -XSe \end{array} \right] \end{align*}$\blacksquare$
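A minimal NumPy sketch of this Newton step (this is the plain affine direction; practical interior point methods replace the last block of $-F$ with $\sigma \mu \mathbf{e} - XS\mathbf{e}$ and damp the step to keep $(x, s) > 0$):

```python
# One Newton step for F(x, y, s) = 0 at a strictly positive (x, s).
import numpy as np

def newton_step(A, b, c, x, y, s):
    m, n = A.shape
    J = np.block([
        [np.zeros((n, n)), A.T,              -np.eye(n)],        # dual residual
        [A,                np.zeros((m, m)),  np.zeros((m, n))], # primal residual
        [np.diag(s),       np.zeros((n, m)),  np.diag(x)],       # complementarity
    ])
    F = np.concatenate([A.T @ y - s - c, A @ x - b, x * s])
    d = np.linalg.solve(J, -F)
    return d[:n], d[n:n + m], d[n + m:]      # (dx, dy, ds)
```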