Computer Science/Artificial Intelligence

2. Heuristics and Competitive Search

  • -
728x90
반응형

Shortest Path Problems

Consider the case where the edges in our search space have costs associated with them, and the goal is to find a path from the origin to the goal that minimizes the cumulative cost

Cumulative Best-First Search

We have a frontier set ordered by the cost of its members, initially empty

  1. Place the origin in the frontier set, appending an empty path vector and a cost value of 0
  1. Repeat until goal found of frontier empty:
    1. If frontier empty, terminate with failure
    1. Select the node, N, with minimal cost in frontier
      1. Remove N from the frontier
      1. Check if N is the goal. If it is, we terminate with success and return the path vector of N which contains the shortest path from origin to goal
      1. For each neighbor, M, of N:
        1. Calculate the path to M. This is the path to N plus N.
        1. Calculate the cost of taking this path to M. This is the cost of getting to N plus the cost of the getting to M from N
        1. If M is not in the visited set, place M in the frontier, associating with it the path and cost calculated
      1. Place N in the visited set.
💡
Dijkstra algorithm으로도 불린다.
💡
추가적으로 frontier node를 priority queue를 이용해서 구현한 상황이므로 Best-First Search인 것이다.

Example

Heuristic Search : A\text{A}^*

Why Heuristic Search?

As we saw in the example, it helps a lot to be clever when breaking ties and putting nodes in the frontier set.

It would be very nice to introduce some bias in the search such that it tends towards the goal, as in, so that is doesn’t go crazy exploring in the opposite direction, but without compromising our optimality guarantees.

→ This idea leads to heuristic search

A\text{A}^* Search

We will change our cumulative best-first search algorithm such that the cost function, f, used to order the frontier is now the sum of two components , g and h

f=g+hf = g + h

gg  : The cost of getting to the node from the origin by the route taken

hh  : The estimated cost of getting to the goal from the node via the optimal path. This estimate is generated using a chosen heuristic

💡
gg는 dijkstra에서 사용했던 cost이고 hh가 새로 추가된 사항이다. 도착지까지의 거리를 예측하는 함수로 해당 함수를 어떻게 잡느냐에 따라 달라지게 된다는 측면에서 heuristic한 것이다.

To ensure that the path found is optimal, our heuristic, h, needs to satisfy some constraints.

  1. If we are performing tree-search, we require that h is optimistic :
    h(N)h(N),for all nodes Nh(N) \le h^*(N), \text{for all nodes }N

    where h(N)h^*(N) is the true cost of getting from node NN to the goal

    💡
    모든 node들에 대해서 실제로 목적지까지 걸리는 cost보다 h의 값이 작아야된다는 의미이다.
  1. If we are performing graph search, we need to ensure that the first time we visit a node we go to it by the shortest path, in which case we also require h to be monotonic
    h(N)h(M)+cost(N,M)h(N) \le h(M) + cost(N, M)

    for all nodes N,MN, M such that MM is a neighbor of NN and h(M)<h(N)h^*(M) < h^*(N)

💡
일반적으로 무난한 선택지는 the minimal number of edges from a node to the goal along any path이다. 특히 grid의 경우에는 goal state까지의 Manhattan distance이다.

A heuristic is better if it is as close to the true value as possible.

→ In fact, cumulative best-first search is just the worst possible AA^* search, with a heuristic that always estiamtes the cost to the goal is 0

Example

Competitive Search

Traditional Game playing

  • Two players
  • Turn-taking

    → excludes : rock-paper-scissors

  • zero-sum

    → excludes cooperation, prisoner’s dilemma

  • Perfect information

    → state is visible

    → excludes : pocker

  • Deterministic

    → excludes : dice, cards

How can we model this type of game?

State space

  • Two kinds of nodes : player one moves (MAX player) , player two moves (MIN player)
  • States have values associated with them
    • High : good for player one
    • Low : good for player two

The MiniMax Algorithm

Finds the best move by a simple recursive computation of the best moves for each player given all possible successor states

💡
즉, MAX player와 MIN player는 각각의 상황에서 자신에게 유리한 선택을 하는 것이다.

Problem

  1. The number of nodes involved grows exponentially with depth
  1. the search space is too large to perform the complete algorithm

Alpha Beta Pruning

We can improve the MiniMax algorithm : at each state, only moves within a particular interval [alpha, beta] are interesting

From MAX’s point of view:

  • Values below alpha : found a better moves already
  • Values above beta : too good to be true! The opponent will never let the game reach the state required to play it.
💡
α\alpha : 현재까지 조사한 경로들 중에서 최소한 얻을 수 있는 값. 이 값은 MAX player가 자신의 최선을 선택을 갱신할 때마다 증가하게 된다.
💡
β\beta : 현재까지 조사한 경로들 중에서 최대치에 도달할 가능성이 있는 값이다. 이 값은 MIN player가 자신의 최선의 선택을 갱신할 때마다 줄어들게 된다.
  • Increases the searchable depth by order of magnitude
    💡
    Therefore, compared to minimax, alpha beta pruning increases the searchable depth by orders of magnitude
    💡
    트리의 일부를 탐색하지 않음으로써 계산량을 줄이고, 따라서 탐색 가능한 깊이를 증가시키게 된다는 의미이다.

Procedure

alphabeta(node, depth, alpha, beta):

If node is terminal/leaf or depth is 0:

Return an evaluation of the node

Else if node is a max node:

For each child of node:

  1. Let v = alphabeta (child, d - 1, alpha, beta)
  1. If v > alpha, set alpha = v
  1. If alpha ≥ beta, break

return alpha

Else

For each child of node:

  1. Let v = alphabeta(child, d - 1, alpha, beta)
  1. If c < beta, set beta = v
  1. If alpha ≥ beta, break

return beta

💡
현재 노드가 MAX 노드인 경우 alpha, beta값은 부모에게 받은 것이고, αβ\alpha \ge \beta 라는 의미는 이미 위에서 MIN node가 그보다 더 작은 옵션을 알고 있는 상황이므로 해당 노드로 내려오지 않을 것이다. 그러한 측면에서 가지치기를 진행한다는 것이다.
💡
현재 노드가 MIN 노드인 경우 alpha, beta값은 부모에게 받은 것이고, αβ\alpha\ge \beta라는 의미는 이미 위에서 MAX node가 그보다 더 큰 옵션을 알고 있는 상황이므로 해당 노드로 내려오지 않을 것이다. 그러한 측면에서 가지치기를 진행한다는 것이다.

Example

Issues

It can suffer from horizon effects (when the depth is limited).

– a bad thing will surely happen, but – you can postpone it beyond the search depth (for example in chess, sacrificing other pieces to put off having a doomed queen taken instead of letting the queen go sooner while keeping the other pieces = the inevitable is postponed at a cost)

💡
즉, depth를 제한시켜 놓는 경우 horizon effects 가 발생할 수 있다.
반응형
Contents

포스팅 주소를 복사했습니다

이 글이 도움이 되었다면 공감 부탁드립니다.