9. Optimality Conditions for Constrained Optimization
Introduction
In a constrained optimization problem, the constraints can be
Linear or non-linear
Equalities or inequalities
Generally, the objective function and constraints can be represented as follows
minimize f(x)
subject to gi(x)=0, i∈E
hi(x)≥0, i∈I
Therefore, it is much harder than unconstrained optimization.
In addition, we have to change the definition of a local optimum slightly as follows:
x∗ is a local minimum if f(x∗)≤f(x) for all feasible x with ‖x−x∗‖≤ϵ,
where feasible means gi(x)=0, i∈E and hi(x)≥0, i∈I, for some ϵ>0.
Optimality condition
Boundary point : a boundary point is optimal if there is no direction that is both feasible and descent with respect to the active constraints.
→ We define p to be a feasible direction at the point x if there exists some ϵ>0 such that x+αp∈S, ∀0≤α≤ϵ
However, for non-linear equality constraints, we may have to consider feasible curves rather than feasible directions.
Linear Equality Constraints
Null space Approach
Note that all linear equality constraints can be converted into the following form:
minimize f(x)
subject to Ax=b
Consider the general solution-update concept: for feasibility, the search direction must lie in the null space of A. (Assume we know at least one feasible point.)
Reformulation
Assume we already know a feasible solution x, i.e. Ax=b.
Take p∈N(A); then A(x+αp)=Ax+αAp=b, so x+αp remains feasible for any step α.
By using this fact, we can get rid of the constraints: define ϕ(v)=f(x+Zv),
where the columns of Z form a basis of N(A).
Therefore, the optimality conditions of unconstrained optimization for the function ϕ with respect to v are as follows
∇ϕ(v)=ZT∇f(x)=0
∇2ϕ(v)=ZT∇2f(x)Z is positive semi-definite.
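As a concrete sketch of the null-space approach, the snippet below applies it to a small quadratic objective f(x)=½xᵀQx−cᵀx with one linear equality constraint; the data Q, c, A, b are made-up example values, not from the text. It builds Z, forms the reduced gradient ZT∇f and reduced Hessian ZT∇2fZ, and solves the reduced unconstrained problem.

```python
# Null-space approach for min 0.5*x^T Q x - c^T x  s.t.  A x = b
# (illustrative sketch; Q, c, A, b are invented example data)
import numpy as np
from scipy.linalg import null_space

Q = np.array([[4.0, 1.0], [1.0, 3.0]])   # positive-definite Hessian of f
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])               # single equality constraint
b = np.array([1.0])

x_bar = np.linalg.lstsq(A, b, rcond=None)[0]  # one feasible point: A x_bar = b
Z = null_space(A)                             # columns of Z form a basis of N(A)

grad = Q @ x_bar - c                     # ∇f(x_bar)
reduced_grad = Z.T @ grad                # Z^T ∇f
reduced_hess = Z.T @ Q @ Z               # Z^T ∇²f Z

# Solve the reduced (unconstrained) problem: Z^T Q Z v = -Z^T ∇f(x_bar)
v = np.linalg.solve(reduced_hess, -reduced_grad)
x_star = x_bar + Z @ v

print(A @ x_star - b)           # ~0: the step stayed in N(A), still feasible
print(Z.T @ (Q @ x_star - c))   # ~0: reduced gradient vanishes at x*
```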
Lagrange multiplier Approach
If x∗ is a local optimum, then there exists λ∗ such that ∇f(x∗)=ATλ∗.
In other words, the gradient is a linear combination of the rows of A.
λ∗ are known as Lagrange multipliers.
How can we derive the above formula? Consider
minimize f(x)
subject to Ax=b
Define the Lagrangian function L(x,λ)=f(x)−λT(Ax−b).
A stationary point (x∗,λ∗) of the function has ∇L(x∗,λ∗)=0.
Therefore, the first-order conditions for the Lagrangian function are as follows:
∇xL(x∗,λ∗)=∇f(x∗)−ATλ∗=0
∇λL(x∗,λ∗)=−(Ax∗−b)=0
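For a quadratic objective f(x)=½xᵀQx−cᵀx these stationarity conditions become one linear system, Qx−ATλ=c and Ax=b, which can be solved directly. A minimal sketch with invented example data Q, c, A, b:

```python
# Lagrange-multiplier approach for min 0.5*x^T Q x - c^T x  s.t.  A x = b
# Stationarity of L(x,λ) = f(x) - λ^T(Ax - b) gives the linear KKT system
#   Q x - A^T λ = c,   A x = b.
import numpy as np

Q = np.array([[4.0, 1.0], [1.0, 3.0]])   # invented positive-definite Hessian
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

n, m = Q.shape[0], A.shape[0]
K = np.block([[Q, -A.T], [A, np.zeros((m, m))]])  # KKT matrix
rhs = np.concatenate([c, b])
sol = np.linalg.solve(K, rhs)
x_star, lam = sol[:n], sol[n:]

print(A @ x_star - b)               # ~0: primal feasibility
print(Q @ x_star - c - A.T @ lam)   # ~0: ∇f(x*) = A^T λ*
```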
Linear Inequality Constraints
Introduction
minimize f(x)
subject to Ax≥b
An interior point : the optimality conditions for unconstrained optimization apply
A boundary point : a point for which some constraints are active
Note : The feasible region is a polyhedron/polytope since the constraints are linear functions.
If we know which constraints are active at optimum, then we can solve an equality problem.
Necessary Conditions
Introduction
minimize f(x)
subject to Ax≥b
Given a solution x∗, let Â be the sub-matrix of rows of A corresponding to the active constraints, and denote by b̂ the corresponding right-hand side.
A local optimum must also be a local optimum of the restricted equality problem with the active constraints.
Necessary Conditions
minimize f(x)
subject to Âx=b̂
Complementary Slackness
By the condition above, we do not need to explicitly split the constraints into active and non-active ones; we simply require the multipliers to be non-negative.
Therefore, we can write the condition as
λi(aiTx∗−bi)=0 for all i
For an inactive row i of A, λi=0 by complementary slackness.
Putting the Pieces Together
If x∗ is a local optimum, there exists a vector λ∗ such that
First order condition
∇f(x∗)=ATλ∗ (this is equivalent to ZT∇f(x∗)=0)
λ∗≥0 (Dual feasibility)
(λ∗)T(Ax∗−b)=0 (Complementary slackness conditions)
Second order condition
ZT∇2f(x∗)Z is positive semi-definite
where Z is a null-space matrix for the active constraints.
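These first-order conditions can be checked numerically on a problem whose answer is known in closed form. The sketch below uses an invented projection problem, minimizing ½‖x−x0‖² subject to x≥0 (so A=I, b=0), whose minimizer is simply the componentwise clipping of x0:

```python
# Checking the first-order conditions on a tiny invented example:
#   min f(x) = 0.5*||x - x0||^2   s.t.   x >= 0   (A = I, b = 0)
# With x0 = (1, -2) the minimizer is the projection x* = (1, 0).
import numpy as np

x0 = np.array([1.0, -2.0])
A = np.eye(2)
b = np.zeros(2)

x_star = np.maximum(x0, 0.0)      # projection onto the feasible set
grad = x_star - x0                # ∇f(x*)

# ∇f(x*) = A^T λ*  with A = I gives λ* = ∇f(x*)
lam = np.linalg.solve(A.T, grad)

print(A @ x_star >= b)            # primal feasibility
print(lam)                        # λ* = (0, 2) >= 0: dual feasibility
print(lam @ (A @ x_star - b))     # 0: complementary slackness
```

Note that the first constraint (x1≥0) is inactive at x∗, and its multiplier is indeed 0, as complementary slackness requires.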
Sufficient Conditions
First order condition
Ax∗≥b
∇f(x∗)=ATλ∗ (this is equivalent to ZT∇f(x∗)=0)
λ∗≥0 (Dual feasibility)
strict complementary slackness
→ Strict complementary slackness is the case where, in the complementary slackness condition, exactly one of λi∗=0 and aiTx∗−bi=0 holds for each i, never both.
Second order condition
ZT∇2f(x∗)Z is positive definite
where Z is a null-space matrix for the active constraints.
Interpretation of KKT condition and Duality
Let our original problem be as follows:
minimize f(x)
subject to gi(x)≥0, i=1,…,m
One way to remove these inequality constraints is to use a penalty function:
minx f(x)+∑iP(gi(x))
where P(t)=0 if t≥0 and P(t)=∞ otherwise.
However, this has a problem: P is non-differentiable at t=0, and this cannot be fixed even if we replace ∞ with some large value. Therefore, we take a linear function as the penalty:
minx f(x)−∑iλigi(x)
where λi≥0.
This is quite natural: if we choose x that satisfies the constraints, the term −λigi(x) is non-positive and acts as a reward; if we choose x that violates them, the term is positive and acts as a penalty.
The problem is that this could change the optimal value. However, if we take the maximum over λi≥0, the linear penalty acts like the original penalty function. Therefore, the original problem can be expressed as
minx maxλ≥0 f(x)−∑iλigi(x)
Therefore, it is actually a min-max problem, which means the order is very important.
But what if we change the order like this?
maxλ≥0 minx f(x)−∑iλigi(x)
In this case, the min player's problem is much simpler since it is just an unconstrained problem.
This reversed-order problem is called the dual problem, and the original problem is called the primal problem.
Duality
Let's consider the following general optimization problem:
minimize f(x)
subject to gi(x)=0, i∈E
hj(x)≥0, j∈I
Similar to the penalty function, we multiply the equality constraints by λi and the inequality constraints by μj, where μj≥0, and move these constraints into the objective function:
minx maxλ,μ≥0 f(x)−∑iλigi(x)−∑jμjhj(x)
To make this a dual problem, just switch the order of min and max:
maxλ,μ≥0 minx f(x)−∑iλigi(x)−∑jμjhj(x)
Note that
L(x,λ,μ)=f(x)−∑iλigi(x)−∑jμjhj(x)
is called the Lagrangian of the optimization problem, and
maxλ,μ≥0 minx L(x,λ,μ)
is called the Lagrangian dual optimization problem.
So why is this important? Because it gives a lower bound on the primal problem.
Take an arbitrary feasible x and arbitrary λ, μ with μ≥0. Then
L(x,λ,μ)=f(x)−∑iλigi(x)−∑jμjhj(x)≤f(x)
since gi(x)=0 and μjhj(x)≥0 for a feasible x.
In addition,
minx′ L(x′,λ,μ)≤L(x,λ,μ)
That means the dual objective never exceeds the primal objective at any feasible point. Therefore,
maxλ,μ≥0 minx′ L(x′,λ,μ)≤min{f(x) : x feasible}
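A one-dimensional sketch of this bound, using an invented problem, min x² subject to x−1≥0, whose dual function happens to have the closed form written in the comments:

```python
# Weak duality on a one-dimensional example (invented for illustration):
#   primal:  min x^2  s.t.  x - 1 >= 0     (optimal value f* = 1 at x = 1)
#   Lagrangian: L(x, μ) = x^2 - μ(x - 1),  μ >= 0
#   dual function: q(μ) = min_x L(x, μ) = -μ^2/4 + μ  (inner minimizer x = μ/2)
import numpy as np

def q(mu):
    # closed-form inner minimization of the Lagrangian over x
    return -mu**2 / 4 + mu

f_star = 1.0                                  # primal optimal value
mus = np.linspace(0.0, 4.0, 9)
print([q(m) <= f_star + 1e-12 for m in mus])  # every dual value lower-bounds f*
print(q(2.0))                                 # μ* = 2 attains q(μ*) = 1 = f*
```

Here the dual optimum equals the primal optimum, which illustrates the strong duality discussed next: the problem is convex and x=2 is strictly feasible, so Slater's condition holds.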
Actually, under some assumptions, strong duality holds: the optimal values of the primal and dual problems coincide.
Slater’s condition
For a convex optimization problem in the form
minimize f(x)
subject to gi(x)≥0, i=1,…,m
we have strong duality if it is strictly feasible, i.e.
∃x∈int(D) such that gi(x)>0 for all i
In other words, for at least one point in the interior of the domain (not on the boundary of the domain), all the inequality constraints hold strictly.
KKT condition
Stationarity condition : ∇xL(x∗,λ∗,μ∗)=∇f(x∗)−∑iλi∗∇gi(x∗)−∑jμj∗∇hj(x∗)=0
Primal feasibility : gi(x∗)=0, i∈E and hj(x∗)≥0, j∈I
Dual feasibility : μj∗≥0
Complementary slackness : μj∗hj(x∗)=0 for all j
In addition, the second-order optimality condition is as follows:
pT∇xx2L(x∗,λ∗,μ∗)p≥0 for all p∈S
where S is determined by the active constraints.
Note
Thinking back to the condition "ZT∇2f(x∗)Z is positive semi-definite" treated earlier, it is easy to see that it is exactly this second-order condition, since ∇xx2gi(x)=0 for linear constraints and hence ∇xx2L(x∗,λ∗)=∇2f(x∗).
Moreover, given that the rows of A correspond to the gradients of the constraints, N(A) corresponds to the tangent space.
Conclusion
Convex optimization problem
The KKT conditions are a necessary condition for an optimal point.
If strong duality holds (e.g., Slater's condition holds), the KKT conditions are a necessary and sufficient condition for an optimal point.
Non-convex optimization problem
The KKT conditions are not a sufficient condition for an optimal point. That means it is possible that some points are not optimal even though they satisfy the KKT conditions.
Non-linear Constraints
minimize f(x)
subject to gi(x)=0, i∈E
hi(x)≥0, i∈I
The optimality conditions remain the same. However, the derivation is more complicated; in particular, we may have to consider feasible curves, not just feasible directions.
Necessary Conditions for Equality Constraints
minimize f(x)
subject to g(x)=0
If x∗ is a local optimum and Z(x∗) is a null-space matrix for the constraint Jacobian ∇g(x∗), then there exist Lagrange multipliers λ∗ such that
∇f(x∗)=∇g(x∗)Tλ∗, or equivalently Z(x∗)T∇f(x∗)=0
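As a small sanity check of this condition on an invented problem, take min x1+x2 subject to x1²+x2²=1, whose minimizer is x∗=(−1/√2,−1/√2); the null space of the constraint Jacobian at x∗ is the tangent space of the circle:

```python
# Necessary condition for a nonlinear equality constraint (invented example):
#   min f(x) = x1 + x2   s.t.   g(x) = x1^2 + x2^2 - 1 = 0
# The minimizer is x* = (-1/√2, -1/√2); check Z(x*)^T ∇f(x*) = 0.
import numpy as np
from scipy.linalg import null_space

x_star = np.array([-1.0, -1.0]) / np.sqrt(2.0)
grad_f = np.array([1.0, 1.0])            # ∇f (constant for a linear objective)
J = (2.0 * x_star).reshape(1, -1)        # Jacobian of g at x*: ∇g(x*)^T as a row

Z = null_space(J)                        # basis for the tangent space at x*
lam = np.linalg.lstsq(J.T, grad_f, rcond=None)[0]  # multipliers: J^T λ = ∇f

print(Z.T @ grad_f)                      # ~0: reduced gradient vanishes
print(np.allclose(J.T @ lam, grad_f))    # True: ∇f(x*) = ∇g(x*)^T λ*
```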