fi:Rn→R,i=1,…,m are the inequality constraint functions
hi:Rn→R,i=1,…,p are the equality constraint functions
Optimal value
p∗=∞ if a problem is infeasible (no x satisfies the constraints)
p∗=−∞ if a problem is unbounded below
Optimal and locally optimal points
x is feasible if x∈dom f0 and it satisfies all constraints
a feasible x is optimal if f0(x)=p∗; Xopt is the set of optimal points
x is locally optimal if there is an R>0 such that x is optimal for
subject to
Examples
f0(x)=1/x,dom f0=R++ : p∗=0, but it has no optimal point.
f0(x)=−logx,dom f0=R++: p∗=−∞
Implicit constraints
the standard form optimization problem has an implicit constraint
we call D the domain of the problem.
Feasibility problem
subject to
can be considered a special case of the general problem with f0(x)=0
Convex optimization problem
subject to
f0,…,fm are convex, h1,…,hp are affine
problem is quasi-convex if f0 is quasi-convex and f1,…,fm convex
often written as
subject to
equivalent problem : by solving one, we can construct a solution of the other
identical problem : objective and constraints are identical
Local and global optima
any locally optimal point of a convex problem is globally optimal
Optimality criterion for differentiable objective function
Suppose that the objective f0 in a convex optimization problem is differentiable, so that for all x,y∈dom f0,
Then x is optimal if and only if it is feasible and
∇f0(x)T(y−x)≥0 for all feasible y
Proof
We already know that f0(y)≥f0(x)+∇f0(x)T(y−x)
If ∇f0(x)T(y−x)≥0 for all feasible y, then f0(y)≥f0(x) for all feasible y. Therefore, x is optimal.
Conversely, suppose x is optimal but there exists a feasible y such that ∇f0(x)T(y−x)<0.
Consider the point z(t)=x+t(y−x) for t∈[0,1]. It is feasible (the feasible set is convex), and since the directional derivative of f0 at x in the direction y−x is ∇f0(x)T(y−x)<0, we have f0(z(t))<f0(x) for small t>0. This contradicts the optimality of x.
Therefore, ∇f0(x)T(y−x)≥0 for all y in the feasible region.
unconstrained problem : x is optimal if and only if x∈dom f0 and ∇f0(x)=0
equality constrained problem
subject to
(Assume the feasible set is non-empty; otherwise the problem is infeasible)
x is optimal if and only if it is feasible (Ax=b) and there exists a ν such that ∇f0(x)+ATν=0
Proof
Since x and y are in the feasible region,
That means
Therefore, the first optimality condition can be written as
Since N(A) is a subspace,
That means
So,
Since N(A) is the orthogonal complement of R(AT) (R(AT) is the row space of A), ∇f0(x)∈R(AT).
Since R(AT)={ATν∣ν∈Rp} is a vector space (so the sign of ν does not matter), this is equivalent to saying that there exists a ν with ∇f0(x)+ATν=0.
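As a quick numerical illustration, here is a minimal NumPy sketch on hypothetical toy data: for an equality-constrained convex quadratic, the optimality condition "Ax=b and ∇f0(x)+ATν=0 for some ν" is exactly a linear (KKT) system.

```python
import numpy as np

# Minimal sketch (hypothetical data): for the equality-constrained convex quadratic
#   minimize (1/2) x^T P x + q^T x   subject to   A x = b,
# the optimality condition "Ax = b and grad f0(x) + A^T nu = 0 for some nu"
# is the linear KKT system  [P  A^T; A  0] [x; nu] = [-q; b].
rng = np.random.default_rng(0)
n, p = 4, 2
M = rng.standard_normal((n, n))
P = M @ M.T + np.eye(n)              # symmetric positive definite
q = rng.standard_normal(n)
A = rng.standard_normal((p, n))
b = rng.standard_normal(p)

KKT = np.block([[P, A.T], [A, np.zeros((p, p))]])
sol = np.linalg.solve(KKT, np.concatenate([-q, b]))
x, nu = sol[:n], sol[n:]

# check the optimality condition and feasibility
print(np.allclose(P @ x + q + A.T @ nu, 0), np.allclose(A @ x, b))
```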
minimization over nonnegative orthant
subject to
x is optimal if and only if x⪰0, ∇f0(x)⪰0, and xi(∇f0(x))i=0 for i=1,…,n
Proof
The first order optimality condition is
for all feasible y
Since y ranges over the whole nonnegative orthant, the linear term ∇f0(x)Ty is unbounded below unless ∇f0(x)⪰0. So we must have ∇f0(x)⪰0, and then ∇f0(x)Ty≥0 for all feasible y, with minimum value 0 attained at y=0.
Therefore the condition ∇f0(x)T(y−x)≥0 for all feasible y reduces to −∇f0(x)Tx≥0, i.e. ∇f0(x)Tx≤0.
Since ∇f0(x)⪰0 and x⪰0, we also have ∇f0(x)Tx≥0, so ∇f0(x)Tx=0, which means xi(∇f0(x))i=0 for each i.
Equivalent convex problems
two problems are (informally) equivalent if the solution of one is readily obtained from the solution of the other, and vice-versa
some common transformations that preserve convexity
Eliminating equality constraints
subject to
is equivalent to
subject to
where F and x0 are such that
Indeed, let y be in the feasible region (i.e. Ay=b and fi(y)≤0 for all i)
Let F be a matrix whose columns are linearly independent vectors spanning N(A); since Ay=b=Ax0, we have y−x0∈N(A), so y=Fz+x0 for some z
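A minimal NumPy sketch (hypothetical data, with A assumed to have full row rank) of how F and x0 can be computed: take the columns of F from the right singular vectors of A corresponding to zero singular values, and take x0 as any particular solution of Ax=b.

```python
import numpy as np

# Minimal sketch (hypothetical data, A assumed to have full row rank):
# build F with R(F) = N(A) and a particular solution x0 of A x = b,
# so that {x | A x = b} = {F z + x0 | z}.
rng = np.random.default_rng(1)
p, n = 2, 5
A = rng.standard_normal((p, n))
b = rng.standard_normal(p)

_, _, Vt = np.linalg.svd(A)
F = Vt[p:].T                                  # columns span the null space N(A)
x0 = np.linalg.lstsq(A, b, rcond=None)[0]     # one particular solution of A x = b

z = rng.standard_normal(n - p)
print(np.allclose(A @ (F @ z + x0), b))       # any z gives a feasible point
```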
Introducing equality constraints
subject to
is equivalent to
subject to
Introducing slack variables for linear inequalities
subject to
is equivalent to
subject to
Epigraph form
standard form convex problem is equivalent to
subject to
Minimizing over some variables
subject to
is equivalent to
subject to
where fˉ0(x1)=infx2f0(x1,x2)
Quasi-convex optimization
subject to
with f0:Rn→R quasi-convex, f1,…,fm convex.
Unlike the convex optimization case, it can have locally optimal points that are not globally optimal.
Nevertheless, a variation of the first-order optimality condition does hold for quasi-convex optimization problems with differentiable objective functions.
In a quasi-convex optimization problem, x is optimal if it is feasible and ∇f0(x)T(y−x)>0 for all feasible y≠x
Representation via a family of convex functions
It will be convenient to represent the sublevel sets of a quasi-convex function f via inequalities of convex functions. We seek a family of convex functions ϕt, t∈R, such that f(x)≤t if and only if ϕt(x)≤0.
Evidently ϕt must satisfy the property that for all x∈Rn,
for s≥t. This is satisfied if, for each x, ϕt(x) is a non-increasing function of t (i.e. ϕs(x)≤ϕt(x) whenever s≥t).
If we take
it satisfies the above conditions.
Note that this representation is not unique, for example, if the sublevel sets of f are closed, we can take
Example
f0(x)=p(x)/q(x) with p convex, q concave, and p(x)≥0, q(x)>0 on dom f0; we can take ϕt(x)=p(x)−tq(x)
for t≥0,ϕt convex in x
p(x)/q(x)≤t if and only if ϕt(x)≤0
Quasi-convex optimization via convex feasibility problems
Let ϕt:Rn→R,t∈R be a family of convex functions that satisfy f0(x)≤t if and only if ϕt(x)≤0,
and also, for each x, ϕt(x) is a non-increasing function of t, i.e.
Let p∗ be the optimal value of the quasi-convex optimization problem.
If the feasibility problem
subject to
is feasible, then we have p∗≤t. Conversely, if the problem is infeasible, then we can conclude p∗≥t.
This can be used as the basis of a simple algorithm for solving the quasi-convex optimization problem using bisection (solving a convex feasibility problem at each step)
Assume that the problem is feasible, and start with an interval [l,u] known to contain the optimal value p∗. We then solve the convex feasibility problem at its midpoint t=(l+u)/2 to determine whether the optimal value is in the lower or upper half of the interval, and update the interval accordingly. This produces a new interval which also contains the optimal value but has half the width of the initial interval.
Basically, the idea is the same as the parametric search algorithm.
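A minimal sketch of this bisection loop, assuming the cvxpy package for the convex feasibility subproblems; the linear-fractional toy problem and all data are hypothetical.

```python
import numpy as np
import cvxpy as cp  # assumed available; any convex feasibility solver would do

# Minimal sketch of the bisection loop for a hypothetical linear-fractional toy
#   minimize (c^T x + d) / (e^T x + f)   subject to   G x <= h,
# using phi_t(x) = (c^T x + d) - t (e^T x + f) <= 0 as the convex feasibility check.
c, d = np.array([1.0, 2.0]), 1.0
e, f = np.array([1.0, 0.0]), 2.0
G = np.vstack([np.eye(2), -np.eye(2)])
h = np.array([5.0, 5.0, 0.0, 0.0])            # the box 0 <= x <= 5

def feasible(t):
    x = cp.Variable(2)
    prob = cp.Problem(cp.Minimize(0),
                      [G @ x <= h, c @ x + d - t * (e @ x + f) <= 0])
    prob.solve()
    return prob.status == cp.OPTIMAL

l, u = 0.0, 10.0                               # interval known to contain p*
while u - l > 1e-4:
    t = (l + u) / 2
    l, u = (l, t) if feasible(t) else (t, u)   # feasible => p* <= t, else p* >= t
print((l + u) / 2)                             # approximately p* (0.5 for this toy data)
```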
Canonical Problems
Depending on the types of objective function and constraints, optimization problems can be categorized. We will delve into the following six subcategories
The above problems have the following inclusion relationships, and moving to the right, they can be seen as more generalized forms.
Linear program (LP)
subject to
Convex problem with affine objective and constraint functions (everything in a linear program is affine)
feasible set is a polyhedron
But be cautious: it is not as simple as one might think at first glance, because the number of vertices can grow exponentially.
Examples
Piecewise-linear minimization
equivalent to an LP
subject to
Chebyshev center of a polyhedron
Chebyshev center of
is the center of the largest inscribed ball
aiTx≤bi for all x∈B if and only if aiTxc+r∥ai∥2≤bi
Hence xc,r can be determined by solving the LP
subject to
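A minimal sketch of this LP, assuming the cvxpy package; the polyhedron data are hypothetical.

```python
import numpy as np
import cvxpy as cp  # assumed available

# Minimal sketch (hypothetical polyhedron {x | a_i^T x <= b_i}): since
# a_i^T x <= b_i for all x in the ball B(xc, r) iff a_i^T xc + r*||a_i||_2 <= b_i,
# the Chebyshev center solves: maximize r subject to these linear constraints.
A = np.array([[1.0, 1.0], [-1.0, 2.0], [0.0, -1.0], [1.0, -2.0]])
b = np.array([2.0, 4.0, 1.0, 2.0])

xc = cp.Variable(2)
r = cp.Variable(nonneg=True)
constraints = [A[i] @ xc + r * np.linalg.norm(A[i]) <= b[i] for i in range(len(b))]
cp.Problem(cp.Maximize(r), constraints).solve()
print(xc.value, r.value)   # center and radius of the largest inscribed ball
```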
Linear-fractional program
subject to
where
It can be solved by bisection like this
If the feasible set is non-empty, the linear-fractional program can be transformed to an equivalent linear program
subject to
with variables y,z
Proof
If x is feasible in our original problem, then
is feasible in the LP problem, with the same objective value cTy+dz=f0(x). Therefore the optimal value of the original problem is greater than or equal to the optimal value of the LP problem; in particular, the (y,z) obtained from an optimal point of the original problem is feasible for the LP.
Conversely,
if (y,z) is feasible in the LP problem with z>0, then x=y/z is feasible in the original problem.
If (y,z) is feasible in the LP problem with z=0 and x0 is feasible for the original problem, then x=x0+ty is feasible in the original problem for all t≥0. Moreover, limt→∞f0(x0+ty)=cTy+dz, so we can find feasible points of the original problem with objective values arbitrarily close to the objective value of (y,z)
Generalized linear-fractional program
where dom f0(x)={x∣eiTx+fi>0,i=1,…,r}
a quasi-convex optimization problem. Unlike the linear-fractional problem, there is no direct way to convert it to an LP. However, since it is quasi-convex, it can be solved by bisection.
Quadratic program (QP)
The convex optimization problem is called a quadratic program(QP) if the objective function is convex quadratic, and the constraint functions are affine.
subject to
where P∈S+n,G∈Rm×n and A∈Rp×n
In a quadratic program, we minimize a convex quadratic function over a polyhedron.
Note that if P∈S++n, a sublevel set is an ellipsoid
Proof
Since P is a symmetric positive definite matrix, it has an orthonormal set of eigenvectors. Call the eigenvectors {v1,…,vn} and the eigenvalues {λ1,…,λn}.
Let x=x1v1+⋯+xnvn
Therefore, it is an ellipsoid if it is not an empty set.
Quadratically constrained quadratic program (QCQP)
subject to
where Pi∈S+n.
If P1,…,Pm∈S++n, the feasible region is an intersection of m ellipsoids and an affine set.
Examples
Least-squares
We already know that
and since
for all x, ATA is a positive semi-definite matrix.
Therefore
it is a quadratic program.
Actually, this problem is simple enough to have the well-known analytical solution x∗=A†b (A† is the pseudo-inverse of A)
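A minimal NumPy sketch on hypothetical data, checking the analytical solution against a generic least-squares solver.

```python
import numpy as np

# Minimal sketch (hypothetical data): the analytical least-squares solution
# x* = pinv(A) @ b agrees with a generic least-squares solver.
rng = np.random.default_rng(2)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

x_pinv = np.linalg.pinv(A) @ b
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x_pinv, x_lstsq))
```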
When linear inequality constraints are added, the problem is called constrained regression or constrained least-squares
As an example we can consider regression with lower and upper bounds on the variables
subject to
Another example is the following
where x1≤x2≤⋯≤xn
Actually, it just adds a set of linear inequalities. Therefore, it is still a quadratic program.
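A minimal sketch of such a constrained least-squares problem, assuming the cvxpy package; the data, bounds, and ordering constraint are hypothetical.

```python
import numpy as np
import cvxpy as cp  # assumed available

# Minimal sketch (hypothetical data): least-squares with box constraints and the
# ordering x_1 <= x_2 <= ... <= x_n; both constraint sets are linear, so this is a QP.
rng = np.random.default_rng(3)
A = rng.standard_normal((30, 6))
b = rng.standard_normal(30)

x = cp.Variable(6)
constraints = [x >= -1, x <= 1, x[:-1] <= x[1:]]
prob = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)), constraints)
prob.solve()
print(x.value)   # monotone, componentwise in [-1, 1]
```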
Second-order cone programming (SOCP)
subject to
where x∈Rn,Ai∈Rni×n,F∈Rp×n.
We call a constraint of the form
where A∈Rk×n, a second-order cone constraint because
If ci=0 for all i=1,⋯,m, then the SOCP is equivalent to a QCQP.
If Ai=0 for all i=1,…,m, then the SOCP reduces to a general LP
Let fi(x)=∥Aix+bi∥2−ciTx−di. Then fi is not differentiable everywhere, since it is not differentiable at points where Aix+bi=0. At first glance it might seem fine to neglect these points, but we cannot, because the solution very often lies exactly at such points.
Examples
Robust linear programming
the parameters in optimization problems are often uncertain
subject to
there can be uncertainty in c,ai,bi
two common approaches to handling uncertainty (for simplicity, assume only ai is uncertain)
deterministic model: constraints must hold for all ai∈Ei
subject to
Take Ei to be an ellipsoid
where aˉi∈Rn,Pi∈S+n
Then our constraints can be expressed as
Therefore it can be represented as an SOCP
subject to
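A minimal sketch of the resulting SOCP, assuming the cvxpy package; the data are hypothetical, and a box constraint is added only to keep the toy problem bounded.

```python
import numpy as np
import cvxpy as cp  # assumed available

# Minimal sketch (hypothetical data): with E_i = { a_bar_i + P_i u : ||u||_2 <= 1 },
# sup_{a in E_i} a^T x = a_bar_i^T x + ||P_i^T x||_2, so the robust constraint becomes
# the second-order cone constraint  a_bar_i^T x + ||P_i^T x||_2 <= b_i.
rng = np.random.default_rng(4)
n, m = 3, 4
c = rng.standard_normal(n)
a_bar = rng.standard_normal((m, n))
P = [0.1 * rng.standard_normal((n, n)) for _ in range(m)]
b = a_bar @ np.ones(n) + 1.0                 # chosen so x = (1,...,1) is feasible

x = cp.Variable(n)
cons = [a_bar[i] @ x + cp.norm(P[i].T @ x, 2) <= b[i] for i in range(m)]
cons += [x >= -5, x <= 5]                    # box added only to keep the toy bounded
prob = cp.Problem(cp.Minimize(c @ x), cons)
prob.solve()
print(prob.status, x.value)
```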
stochastic model: ai is random variable; constraints must hold with probability η
subject to
Assume ai is Gaussian with mean aˉi, covariance Σi (i.e. ai∼N(aˉi,Σi))
Then aiTx∼N(aˉiTx,xTΣix). So,
where Φ(x) is CDF of N(0,1)
Then it can be written as
subject to
If Φ−1(η)≥0 (i.e. η≥1/2), it is equivalent to the SOCP.
Geometric program (GP)
Monomial function
with c>0 and ai∈R, is called a monomial function
Posynomial function
it is just a sum of monomials
Geometric program (GP)
subject to
where f0,…,fm are posynomial functions and h1,…,hp are monomial functions.
Geometric program in convex form
Geometric programs are not generally convex optimization problem, but they can be transformed to convex problems by a change of variables and a transformation of the objective and constraint functions.
monomial f(x)=cx1a1⋯xnan transforms to
where b=logc
posynomial f(x)=∑k=1Kckx1a1k⋯xnank transforms to
where bk=logck
Therefore, the geometric program transforms into a convex problem
subject to
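A minimal sketch of this convex form on a hypothetical toy GP, assuming the cvxpy package; each log-sum-exp term corresponds to one transformed posynomial.

```python
import numpy as np
import cvxpy as cp  # assumed available

# Minimal sketch of the convex form of a hypothetical toy GP, after y = log x:
#   minimize   x1/x2 + x2       ->  minimize   log_sum_exp(A0 y + b0)
#   subject to 1/(x1*x2) <= 1   ->  subject to log_sum_exp(A1 y + b1) <= 0
# Each row of A_k holds the exponents of one monomial term; b_k = log of its coefficient.
A0 = np.array([[1.0, -1.0], [0.0, 1.0]]); b0 = np.zeros(2)
A1 = np.array([[-1.0, -1.0]]);            b1 = np.zeros(1)

y = cp.Variable(2)
prob = cp.Problem(cp.Minimize(cp.log_sum_exp(A0 @ y + b0)),
                  [cp.log_sum_exp(A1 @ y + b1) <= 0])
prob.solve()
print(np.exp(y.value))   # recover x = exp(y)
```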
Generalized inequality constraints
One very useful generalization of the standard form convex optimization problem is obtained by allowing the inequality constraint functions to be vector valued , and using generalized inequalities in the constraints.
subject to
where f0:Rn→R, Ki⊂Rki are proper cones, and fi:Rn→Rki are Ki-convex.
Conic form problems
special case with affine objectives and constraints
subject to
extends linear programming (K=R+m) to non-polyhedral cones
Semidefinite programming
subject to
where G,F1,…,Fn∈Sk, and A∈Rp×n
Note that it includes problems with multiple LMI constraints
is equivalent to a single LMI
Example
LP and equivalent SDP
LP
subject to
SDP
subject to
SOCP and equivalent SDP
SOCP
subject to
SDP
subject to
Proof
Before we start, we have to know the following fact (Schur complement of positive semidefinite matrix)
Then,
if D≻0 then
We already know that ciTx+di≥0 by non-negativity of the norm. Therefore,
Since ciTx+di≥0,
Eigenvalue minimization
where A(x)=A0+x1A1+⋯+xnAn with given Ai∈Sk
we can express it as a SDP form
subject to
where x∈Rn,t∈R
it follows from the fact that
Proof
λ is an eigenvalue of A(x) if and only if λ−t is an eigenvalue of A(x)−tI.
Therefore, λmax(A(x))≤t if and only if all eigenvalues of A(x)−tI are non-positive. (i.e. A(x)−tI⪯0)
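A minimal sketch of this SDP, assuming the cvxpy package; A0, A1, A2 are hypothetical symmetric matrices chosen so that the problem is bounded below.

```python
import numpy as np
import cvxpy as cp  # assumed available

# Minimal sketch (hypothetical symmetric data): minimize lambda_max(A(x)) with
# A(x) = A0 + x1*A1 + x2*A2, written as the SDP  minimize t  s.t.  A(x) - t*I <= 0.
A0 = np.array([[1.0, 0.5, 0.0, 0.0],
               [0.5, 2.0, 0.0, 0.0],
               [0.0, 0.0, 3.0, 0.3],
               [0.0, 0.0, 0.3, 4.0]])
A1 = np.diag([1.0, -1.0, 2.0, -2.0])
A2 = np.diag([1.0, 1.0, -1.0, -1.0])   # chosen so lambda_max(A(x)) grows in every direction

x = cp.Variable(2)
t = cp.Variable()
A_x = A0 + x[0] * A1 + x[1] * A2
prob = cp.Problem(cp.Minimize(t), [t * np.eye(4) - A_x >> 0])
prob.solve()
print(t.value, np.linalg.eigvalsh(A_x.value).max())   # should agree
```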
Matrix norm
where x∈Rn,t∈R,A(x)=A0+x1A1+⋯+xnAn with given Ai∈Rp×q
we can express it as a SDP form
subject to
the above matrix is positive semi-definite if
Since t is positive,
Therefore, all eigenvalues of A(x)TA(x) are less than equal to t2
Vector optimization
General vector optimization problem
subject to
vector objective f0:Rn→Rq minimized with respect to proper cone K⊆Rq
Convex vector optimization problem
subject to
with f0 K-convex and f1,…,fm convex.
Optimal and Pareto optimal points
set of achievable objective values
feasible x is optimal if f0(x) is a minimum value of O
x∗ is optimal since f0(x∗) is comparable to every other point in O and is less than or equal to all of them
feasible x is Pareto optimal if f0(x) is a minimal value of O
xpo is Pareto optimal since no point of O is smaller than f0(xpo); every other point is either larger or incomparable
Multi-criterion optimization
vector optimization problem with K=R+q
where we want all Fi’s to be small
feasible x∗ is optimal if
if there exists an optimal point, the objectives are non-competing
feasible xpo is Pareto optimal if
if there are multiple Pareto optimal values, there is a trade-off between the objectives.
Example : Regularized least-squares
Scalarization
Scalarization is a standard technique for finding Pareto optimal points for a vector optimization problem. To find Pareto optimal points: choose λ≻K∗0 and solve scalar problem
subject to
Scalarization for multicriterion problems
to find Pareto optimal points, minimize positive weighted sum
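For the regularized least-squares example above, the scalarized problem has a closed form: minimizing ∥Ax−b∥2^2+λ∥x∥2^2 gives x(λ)=(ATA+λI)−1ATb, and sweeping λ>0 traces out Pareto optimal points. A minimal NumPy sketch on hypothetical data:

```python
import numpy as np

# Minimal sketch (hypothetical data): scalarizing the bi-criterion problem
# (||Ax - b||_2^2, ||x||_2^2) with weight lam > 0 gives regularized least-squares,
# whose minimizer is x(lam) = (A^T A + lam*I)^{-1} A^T b; sweeping lam
# traces out Pareto optimal points (the trade-off curve).
rng = np.random.default_rng(6)
A = rng.standard_normal((30, 5))
b = rng.standard_normal(30)

for lam in [0.01, 0.1, 1.0, 10.0]:
    x = np.linalg.solve(A.T @ A + lam * np.eye(5), A.T @ b)
    print(lam, np.sum((A @ x - b) ** 2), np.sum(x ** 2))
```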