
1. Basics


A Generic Optimization Model

$$\min f(x_1, x_2, \dots, x_n)$$

$$\text{subject to}$$

$$\begin{aligned}
g_1(x_1, x_2, \dots, x_n) &\le b_1 \\
g_2(x_1, x_2, \dots, x_n) &\le b_2 \\
&\;\;\vdots \\
g_m(x_1, x_2, \dots, x_n) &\le b_m
\end{aligned}$$

In compact form

$$\min f(x)$$

where $x \in S$ ($S$ : the set of all feasible solutions permitted by the constraints)
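
As a concrete illustration of this generic model, here is a minimal sketch that encodes a small two-variable problem with one inequality constraint using SciPy's `scipy.optimize.minimize`. The objective, the constraint, and the bound $b_1 = 2$ are made-up examples, not taken from these notes.

```python
# A minimal sketch of the generic model  min f(x)  s.t.  g_i(x) <= b_i.
# The objective and constraint below are assumed examples.
import numpy as np
from scipy.optimize import minimize

def f(x):
    # example objective: f(x1, x2) = (x1 - 1)^2 + (x2 - 2)^2
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

# example constraint: g(x1, x2) = x1 + x2 <= 2  (i.e., b1 = 2).
# SciPy expects inequality constraints in the form c(x) >= 0,
# so g(x) <= b is passed as b - g(x) >= 0.
cons = [{"type": "ineq", "fun": lambda x: 2.0 - (x[0] + x[1])}]

x0 = np.zeros(2)                      # initial guess
res = minimize(f, x0, constraints=cons)
print(res.x, res.fun)                 # feasible (local) minimizer and its objective value
```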

Assumptions

  • Continuous variables, smooth functions (twice continuously differentiable unless stated otherwise)
  • Linear programming (LP) and Non-linear programming (NLP)
  • What do we mean by solving the problem?
    • LP : compute the optimum
    • NLP : hard in general; we focus on a local optimum

Boundary and Interior

  • Boundary of $S$ : formed by all feasible points with at least one binding constraint; binding = an inequality constraint satisfied as an equality
  • Interior of $S$ : the rest of $S$, i.e., all feasible points with no binding constraint

Global and Local Optima

  • $x^*$ is a global minimum if $f(x^*) \le f(x), \ \forall x \in S$
  • $x^*$ is a local minimum if there exists $\epsilon > 0$ such that $f(x^*) \le f(x)$ for all $x \in S$ with $d(x^*, x) \le \epsilon$

In general, it is difficult to find the global minimum.

Therefore, we will focus on finding a local minimum in nonlinear optimization problems.

How to Solve it?

  • Take a sequence of steps to improve the objective function
  • Convergence : the iterates approach a point that is locally optimal

General Algorithm

  1. Specify some initial guess of the solution $x_0$
  2. For $k = 0, 1, \dots$
    1. If $x_k$ is locally optimal, stop
    2. Determine a search direction $p_k$
    3. Determine a step length $\alpha_k$ such that $x_{k+1} = x_k + \alpha_k p_k$ is a better solution than $x_k$
💡
Technically, we also have to take the shape (curvature) of the function at the current point into account; a minimal sketch of this loop is given below.
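
As a concrete instance of the general algorithm, the sketch below uses steepest descent ($p_k = -\nabla f(x_k)$) as the search direction and a fixed step length; both choices, and the example objective, are illustrative assumptions rather than prescriptions from these notes.

```python
# Minimal sketch of the general algorithm with illustrative choices:
# steepest-descent direction and a fixed step length.
import numpy as np

def grad_f(x):
    # gradient of the example objective f(x1, x2) = x1^2 + 2*x2^2
    return np.array([2.0 * x[0], 4.0 * x[1]])

x = np.array([3.0, -2.0])          # step 1: initial guess x_0
alpha = 0.1                        # fixed step length (illustrative)

for k in range(1000):              # step 2: iterate
    g = grad_f(x)
    if np.linalg.norm(g) < 1e-8:   # 2.1: stop when (approximately) locally optimal
        break
    p = -g                         # 2.2: search direction
    x = x + alpha * p              # 2.3: x_{k+1} = x_k + alpha_k * p_k

print(k, x)                        # converges to the minimizer (0, 0) of the example f
```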

Descent Direction

  • It is desirable to use a descent direction as our search direction
  • Formal definition

    A direction $p$ of a function $f$ at a point $x$ is a descent direction if there exists $\epsilon > 0$ such that $f(x + \alpha p) < f(x)$ for all $0 < \alpha \le \epsilon$

Note

Any direction $p$ at an angle of less than $90^\circ$ from $-\nabla f(x)$ is a descent direction
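
Equivalently, $p$ is a descent direction at $x$ exactly when $p^T \nabla f(x) < 0$. A small numerical check of this test, on an assumed example function:

```python
# A direction p makes an angle of less than 90 degrees with -grad f(x)
# exactly when p . grad_f(x) < 0, i.e. when p is a descent direction.
import numpy as np

def grad_f(x):
    # gradient of the example function f(x1, x2) = x1^2 + x2^2
    return 2.0 * x

x = np.array([1.0, 1.0])
g = grad_f(x)

for p in [np.array([-1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]:
    is_descent = p @ g < 0.0          # negative inner product => descent direction
    print(p, "descent" if is_descent else "not descent")
```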

Hessian

The matrix of second-order partial derivatives of a function $f$ of several variables is called the Hessian

$$\nabla^2 f = \begin{bmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}$$

The Hessian is a square and symmetric matrix

It is useful for optimality checks and for function approximation.
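
One simple way to see the Hessian in action is to approximate it numerically by central differences and check that it is square, symmetric, and (here) positive definite; the example function below is an assumed quadratic, not from the notes.

```python
# Central-difference approximation of the Hessian of an example function.
# f(x1, x2) = x1^2 + x1*x2 + 3*x2^2 has the constant Hessian [[2, 1], [1, 6]].
import numpy as np

def f(x):
    return x[0] ** 2 + x[0] * x[1] + 3.0 * x[1] ** 2

def numerical_hessian(f, x, h=1e-4):
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            # central second difference for d^2 f / (dx_i dx_j)
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h ** 2)
    return H

H = numerical_hessian(f, np.array([1.0, -1.0]))
print(np.round(H, 3))                       # approximately [[2, 1], [1, 6]], square and symmetric
print(np.all(np.linalg.eigvalsh(H) > 0))    # True: positive definite, which is what optimality checks use
```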

Taylor Series

  1. One dimension
    $f(x_0 + p) \approx f(x_0) + p\, f'(x_0) + \frac{1}{2} p^2 f''(x_0) + \cdots$
  2. Multiple dimensions
    $f(x_0 + p) \approx f(x_0) + p^T \nabla f(x_0) + \frac{1}{2} p^T \nabla^2 f(x_0)\, p + \cdots$
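
A quick numerical sanity check of the multi-dimensional expansion (the function below is an assumed example): for a small step $p$, the second-order model should match $f(x_0 + p)$ more closely than the first-order model.

```python
# Compare first- and second-order Taylor approximations of an example
# function f(x1, x2) = exp(x1) + x2^2 around x0, for a small step p.
import numpy as np

def f(x):
    return np.exp(x[0]) + x[1] ** 2

def grad_f(x):
    return np.array([np.exp(x[0]), 2.0 * x[1]])

def hess_f(x):
    return np.array([[np.exp(x[0]), 0.0], [0.0, 2.0]])

x0 = np.array([0.0, 1.0])
p = np.array([0.1, -0.05])

exact = f(x0 + p)
first = f(x0) + p @ grad_f(x0)                 # linear model
second = first + 0.5 * p @ hess_f(x0) @ p      # quadratic model

print(exact, first, second)   # the quadratic model is the closer approximation
```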

What about approximating $f$ at point $x_0$ with $f(x_0) + p^T \nabla f(x_0)$ and finding the best possible $p$ for the approximation?

$$\min_p \; f(x_0) + p^T \nabla f(x_0)$$

What about adding one more term in the approximation?

$$\min_p \; f(x_0) + p^T \nabla f(x_0) + \frac{1}{2} p^T \nabla^2 f(x_0)\, p$$
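
Note that minimizing the linear model alone is unbounded unless $p$ is restricted; restricting $\|p\|$ yields the steepest-descent direction $p \propto -\nabla f(x_0)$. With the quadratic term included and $\nabla^2 f(x_0)$ positive definite, setting the model's gradient to zero gives the Newton step, $\nabla^2 f(x_0)\, p = -\nabla f(x_0)$. A minimal sketch on an assumed quadratic example:

```python
# Minimizing the quadratic model f(x0) + p^T grad + 0.5 p^T H p (with H
# positive definite) gives the Newton step: solve H p = -grad.
# The example function below is assumed for illustration.
import numpy as np

def grad_f(x):
    # gradient of f(x1, x2) = x1^2 + x1*x2 + 3*x2^2 - x1
    return np.array([2.0 * x[0] + x[1] - 1.0, x[0] + 6.0 * x[1]])

def hess_f(x):
    # Hessian of the same quadratic (constant and positive definite)
    return np.array([[2.0, 1.0], [1.0, 6.0]])

x0 = np.array([5.0, 5.0])
p = np.linalg.solve(hess_f(x0), -grad_f(x0))   # Newton step
x1 = x0 + p

print(x1, grad_f(x1))   # for a quadratic f, one Newton step lands on the stationary point
```

For a non-quadratic $f$ this step only minimizes the local model, so in practice it is combined with a step-length choice as in the general algorithm above.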

Sensitivity (Conditioning)

  • There may be errors in the input data
  • Also, computations performed by computers are not perfectly accurate (floating-point rounding)
  • Intuitively, we have an ill-conditioned problem if the solution is very sensitive to changes in the data values
  • For a matrix $A$ (e.g., of a linear system), sensitivity is measured by the condition number:

$$\text{cond}(A) = \|A\| \cdot \|A^{-1}\|$$
💡
A singular matrix has condition number $\infty$.
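
A minimal numerical illustration (the matrix and right-hand side are assumed example values): an ill-conditioned matrix amplifies a tiny perturbation of the data into a large change in the solution.

```python
# Condition number cond(A) = ||A|| * ||A^-1||, and its effect: for an
# ill-conditioned A, a tiny change in b causes a large change in the
# solution of A x = b. The matrix and data below are assumed examples.
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])            # nearly singular => ill-conditioned
print(np.linalg.cond(A))                 # roughly 4e4 (2-norm condition number)

b = np.array([2.0, 2.0001])
b_perturbed = b + np.array([0.0, 1e-4])  # small change in the data

x = np.linalg.solve(A, b)
x_perturbed = np.linalg.solve(A, b_perturbed)
print(x, x_perturbed)                    # the solution changes by O(1), not O(1e-4)
```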