
7. Derivative-Free Methods


Motivation

So far, all of the methods we have considered require at least first-order derivatives

But what if

  1. The function is not differentiable everywhere?
  1. Computing the derivatives is very time-consuming?

This is where derivative-free methods come in.

→ search methods that do not use any derivative information

Compass Search

  • Intuition : Try a few points around the current solution
    • Better point found : move, and repeat
    • Otherwise : reduce the scope
  1. A pattern of trial points; one along each dimension and direction ("compass")
  1. Reduce the pattern size (step size) in case of no improvement


Algorithm

  1. Specify an initial guess $x_0$, an initial pattern size $\Delta_0 > 0$, and a convergence tolerance $\Delta_{tol}$
  1. For $k = 0, 1, \dots$
    1. If $\Delta_k < \Delta_{tol}$, stop
    1. Evaluate $f$ at the points $x_k$ and $x_k + \Delta_k d_i$ for $i = 1, \dots, 2n$

      Each $d_i$ has a single $1$ or $-1$ in one element and zeros elsewhere, so the $2n$ directions are $\pm e_1, \dots, \pm e_n$

      💡
      Think of it as trying $+1$ and $-1$ along each coordinate.
    1. If $f(x_k + \Delta_k d_i) < f(x_k)$ for some $i$, set $x_{k+1} = x_k + \Delta_k d_i$ and $\Delta_{k+1} = \Delta_k$
    1. Otherwise, set $\Delta_{k+1} = \Delta_k / 2$
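
A minimal NumPy sketch of the algorithm above. The function name compass_search and the max_iter safeguard are my additions; moving to the first improving trial point (rather than the best of all $2n$) is one valid reading of the "for some $i$" rule.

```python
import numpy as np

def compass_search(f, x0, delta0=1.0, delta_tol=1e-6, max_iter=10_000):
    """Minimize f by compass search, following the steps above."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    delta = delta0
    fx = f(x)

    for _ in range(max_iter):
        if delta < delta_tol:          # stop once the pattern is small enough
            break
        improved = False
        for i in range(n):             # 2n trial points: +/- delta along each axis
            for sign in (1.0, -1.0):
                trial = x.copy()
                trial[i] += sign * delta
                f_trial = f(trial)
                if f_trial < fx:       # better point found: move, keep delta
                    x, fx = trial, f_trial
                    improved = True
                    break
            if improved:
                break
        if not improved:               # no improvement: halve the pattern size
            delta /= 2.0
    return x, fx

# Example: minimize a smooth 2-D quadratic; the minimizer is (1, -0.5)
x_best, f_best = compass_search(
    lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2,
    x0=[0.0, 0.0],
)
```

Note that only function values are compared; no gradients appear anywhere, which is the whole point of the method.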

Convergence

Under suitable assumptions on the objective function, the compass search method is guaranteed to converge. A formal statement of the theorem is given below.

Let $f$ be a real-valued function of $n$ variables. Let $x_0$ be a given initial point and determine $\{x_k\}$ using the compass search algorithm with initial pattern size $\Delta_0$. Assume that

  1. the set $S = \{x : f(x) \le f(x_0)\}$ is bounded
  1. $\nabla f$ is Lipschitz continuous for all $x$, that is,
    $\|\nabla f(x) - \nabla f(y)\| \le L\|x - y\|$

    for some constant $0 < L < \infty$

Consider the subsequence of unsuccessful iterations, i.e., the iterations where a better point cannot be found and $\Delta_k$ is reduced. Then, along this subsequence,

$\lim_{k \to \infty} \|\nabla f(x_k)\| = 0$
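
A quick numerical check of this result, in the same style as the sketch above. The quadratic test function below is my own choice; it has a Lipschitz gradient and bounded sublevel sets, so both assumptions hold, and the printed gradient norms shrink toward zero at the unsuccessful iterations.

```python
import numpy as np

f     = lambda x: (x[0] - 1.3) ** 2 + 2.0 * (x[1] + 0.4) ** 2
gradf = lambda x: np.array([2.0 * (x[0] - 1.3), 4.0 * (x[1] + 0.4)])

x, delta = np.array([3.0, 2.0]), 1.0
while delta >= 1e-6:
    # all 2n = 4 compass trial points around x
    trials = [x + delta * s * e for e in np.eye(2) for s in (1.0, -1.0)]
    best = min(trials, key=f)
    if f(best) < f(x):
        x = best          # successful iteration: move, keep delta
    else:
        # unsuccessful iteration: the gradient norm shrinks along with delta
        print(f"Delta = {delta:.1e}, ||grad f(x_k)|| = {np.linalg.norm(gradf(x)):.2e}")
        delta /= 2.0
```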

It is also possible to determine the rate of convergence of the compass search algorithm.

If additional assumptions are made, then it can be shown that at the unsuccessful iterations,

$\|x_k - x^*\| \le c\,\Delta_k$

for some constant $c$ that does not depend on $k$. This is similar to, but not the same as, a linear rate of convergence; there are two main differences.

  1. We consider only the unsuccessful iterations and ignore the successful ones.
  1. The result shows only that $\{\|x_k - x^*\|\}$ is bounded by a linearly convergent sequence. At each unsuccessful iteration we divide $\Delta_k$ by $2$, so the sequence $\{\Delta_k\}$ converges linearly to zero. This does not guarantee that $\{\|x_k - x^*\|\}$ itself converges linearly.

Nevertheless, it is similar to linear convergence, and this property is sometimes referred to as r-linear convergence.
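
To spell the bound out: if iteration $k$ has been preceded by $m$ unsuccessful iterations, each of which halved the pattern size, then

$\Delta_k = \Delta_0 \, 2^{-m}, \qquad \|x_k - x^*\| \le c\,\Delta_0\,2^{-m}$

so the error is majorized by a sequence that converges linearly to zero with ratio $1/2$, even though the errors themselves need not shrink monotonically. This majorization is exactly what r-linear convergence means.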
