Informed (Heuristic) Search
1 Feb 2022
§ Motivation
Many AI search problems have search spaces that are exponential in size, then our main focus is to use domain knowledge to improve search efficiency.
Domain Knowledge
Domain knowledge refers to anything pertinent to solving a problem.
Examples of knowledge: rules of thumb, common solutions, problem decomposition into smaller subgoals, operator interaction/dependencies, etc.
§ Heuristic Functions
Heuristic Function
In our context, domain knowledge is given by a heuristic function
Ideally, is an estimate of the path cost remaining from a state to the closest goal.
Generally, and
By using a heuristic, we can guide the search away from unproductive areas of the search space.
Navigation
Oradea
*
/ \
/ \ * Neamt
/ \ \
Zerind * \ \
/ \ \
/ \ \
/ \ Siblu [goal] Fagaras * Iasi
Arad *---------------*---------------* \
| | \ \
| | \ \
| | \ \
| * Rimnicu Vilcea \ * Vaslui [goal]
* Timisoara |\ \ /
\ | \ \ /
\ | \ \ /
\ | \ \ /
\ | \ Petesti \ /
Lugoj * | *-------------------*-----------*-------------* Hirsova
| | / Bucharest | Urziceni \
| | / | \
Mehadia * | / | \
| | / Giurgiu * Eforie *
| |/
Dobreta *----------*
Craiova
Suppose we again want to navigate from Sibiu to Vaslui.
- Using BFS (FIFO) the frontier at each pass is given by:
- S | O, A, R, F | Z, T, C, P, B | L, D, U, G | M, H, V.
- Using DFS (LIFO), the sequence of states we visit:
- S, O, Z, A, T, L, M, D, C, R, P, B, F, G, U, H, E, V.
Now, consider the heuristic of the straight line distance, i.e. we use a priority queue and order them based on the straight line distance between a node and the goal.
- Using UC (PQ), the sequence of states we visit:
- S, F, B, U, V
Tile Puzzle
Initial Goal
.---+---+---. .---+---+---.
| 7 | 2 | 4 | | | 1 | 2 |
+---+---+---+ +---+---+---+
| 5 | | 6 | | 3 | 4 | 5 |
+---+---+---+ +---+---+---+
| 8 | 3 | 1 | | 6 | 7 | 8 |
'---+---+---' '---+---+---'
A simple heuristic for the tile puzzle would be the Hamming distance, or the number of disagreements between a given state and the goal.
- This is an underestimate as it will take more moves than the number of tiles out of place to get each tile in the proper place.
- While the heuristic can help differentiate between almost-solved and no-where-close states, it has no guarantee for "closeness", i.e. conveys that only one tile is out of place, but it does not convey with it the number of possible moves.
Another heuristic is the Manhattan distance.
- This method can underestimate the required number of moves, as the process of moving a tile can cause other tiles to be displaced from the correct position.
- Likewise, this method can also overestimate the number of moves --- moving a tile can possibly move multiple tiles into place.
§ Greedy Search
Taking the pseudocode for BestFirstSearch from Uninformed Search
function BestFirstSearch(problem, f: Function -> bool) -> Node? {
let node := Node(state: problem.initial)
let frontier := PriorityQueue(ordering: f, contents: [ node ])
let visited := HashTable([ problem.initial: node ])
while not frontier.empty() {
let node := frontier.pop()
if problem.goal_test(state: node.state)
return node
for child in problem.expand(node) {
let successor := child.state
if not visited.contains(successor) or child.cost < visited[successor].cost {
visited[successor] := child
frontier.push(child)
}
}
}
return None
}
Then, by passing as the functor f
in the BestFirstSearch algorithm, we order the priority queue in accordance to our heuristic.
Garden Path Problem
With greedy search, the heuristic can cause the algorithm to begin searching and goal testing nodes that do not belong to the minimal cost path.
Take the map from the navigation example, suppose we start from Vaslui and want to find a path to Oradea. The straight line distance heuristic may cause us to first check the path from Vaslui to Iasi and finally to Neamt, however, by just looking at the map we know that the path to Neamt is a dead end.
§ A* Algorithm
Like greedy search, A* builds off of BestFirstSearch, except instead of passing the heuristic as the function we let where is the path-cost thus far and is the heuristic --- except it is an admissible heuristic.
Admissible Heuristic
An admissible heuristic is a heuristic that never overestimates the true distance to a goal, and
While the heuristic may be inaccurate, having an admissible heuristic guarantees optimality of A*.
However, using a nonadmissible heuristic in A* can make the search more efficient, i.e. it may become faster at finding a goal node but there are no guarantees on it being the minimal cost goal.
A* finds a goal with minimum path cost (A* is optimal).
Suppose the optimal goal is but the algorithm instead returns , where
Let be a node in the frontier (priority queue) that also lies on the optimal path to
Assuming that we use an admissible heuristic,
Contradiction. Therefore, as well as all other nodes along the minimum cost path to will be dequeued before
scores monotonically increase along any path from the root.
A* explores states in order of increasing
§ A* Analysis
Generally, the more accurate the heuristic, the faster the search.
- For the boundary case of A* degenerates into uniform cost and has a time complexity of
- For a perfect heuristic A* will be led directly to the goal and have a time complexity that is linear in the path length.
For a heuristic with a bounded inaccuracy, A* will be subexponential.
First, we define "relative error" to be
where is the optimal heuristic.
Then, A* has a time complexity of
where is the path length to the goal
If then A* will search a subexponential number of nodes before finding the optimal goal --- however, in practice this is rarely attainable.