Consider the case where the edges in our search space have costs associated with them, and the goal is to find a path from the origin to the goal that minimizes the cumulative cost
Cumulative Best-First Search
We have a frontier set ordered by the cost of its members, initially empty
Place the origin in the frontier set, appending an empty path vector and a cost value of 0
Repeat until goal found of frontier empty:
If frontier empty, terminate with failure
Select the node, N, with minimal cost in frontier
Remove N from the frontier
Check if N is the goal. If it is, we terminate with success and return the path vector of N which contains the shortest path from origin to goal
For each neighbor, M, of N:
Calculate the path to M. This is the path to N plus N.
Calculate the cost of taking this path to M. This is the cost of getting to N plus the cost of the getting to M from N
If M is not in the visited set, place M in the frontier, associating with it the path and cost calculated
Place N in the visited set.
Example
Heuristic Search : A∗
Why Heuristic Search?
As we saw in the example, it helps a lot to be clever when breaking ties and putting nodes in the frontier set.
It would be very nice to introduce some bias in the search such that it tends towards the goal, as in, so that is doesn’t go crazy exploring in the opposite direction, but without compromising our optimality guarantees.
→ This idea leads to heuristic search
A∗ Search
We will change our cumulative best-first search algorithm such that the cost function, f, used to order the frontier is now the sum of two components , g and h
g : The cost of getting to the node from the origin by the route taken
h : The estimated cost of getting to the goal from the node via the optimal path. This estimate is generated using a chosen heuristic
To ensure that the path found is optimal, our heuristic, h, needs to satisfy some constraints.
If we are performing tree-search, we require that h is optimistic :
where h∗(N) is the true cost of getting from node N to the goal
If we are performing graph search, we need to ensure that the first time we visit a node we go to it by the shortest path, in which case we also require h to be monotonic
for all nodes N,M such that M is a neighbor of N and h∗(M)<h∗(N)
A heuristic is better if it is as close to the true value as possible.
→ In fact, cumulative best-first search is just the worst possible A∗ search, with a heuristic that always estiamtes the cost to the goal is 0
Example
Competitive Search
Traditional Game playing
Two players
Turn-taking
→ excludes : rock-paper-scissors
zero-sum
→ excludes cooperation, prisoner’s dilemma
Perfect information
→ state is visible
→ excludes : pocker
Deterministic
→ excludes : dice, cards
How can we model this type of game?
State space
Two kinds of nodes : player one moves (MAX player) , player two moves (MIN player)
States have values associated with them
High : good for player one
Low : good for player two
The MiniMax Algorithm
Finds the best move by a simple recursive computation of the best moves for each player given all possible successor states
Problem
The number of nodes involved grows exponentially with depth
the search space is too large to perform the complete algorithm
Alpha Beta Pruning
We can improve the MiniMax algorithm : at each state, only moves within a particular interval [alpha, beta] are interesting
From MAX’s point of view:
Values below alpha : found a better moves already
Values above beta : too good to be true! The opponent will never let the game reach the state required to play it.
Increases the searchable depth by order of magnitude
Procedure
alphabeta(node, depth, alpha, beta):
If node is terminal/leaf or depth is 0:
Return an evaluation of the node
Else if node is a max node:
For each child of node:
Let v = alphabeta (child, d - 1, alpha, beta)
If v > alpha, set alpha = v
If alpha ≥ beta, break
return alpha
Else
For each child of node:
Let v = alphabeta(child, d - 1, alpha, beta)
If c < beta, set beta = v
If alpha ≥ beta, break
return beta
Example
Issues
It can suffer from horizon effects (when the depth is limited).
– a bad thing will surely happen, but
– you can postpone it beyond the search depth
(for example in chess, sacrificing other pieces to put off having a doomed queen taken instead of letting the queen go sooner while keeping the other pieces = the inevitable is postponed at a cost)