Strategy: Traverse state space by considering only locally-reachable states
Typically incomplete and **suboptimal**.
Anytime property: the longer the runtime, the better the solution tends to be.
Problem formulation is slightly different:
States: Represent different configurations or candidate solutions. They may or may not map to an actual problem state, but are used to represent potential solutions.
Initial State: The starting configuration (candidate solution)
Note how there are no action/transition properties in the formulation.
Question:
In local search, the solution is the final state rather than the path taken to reach it. Can local search be used to potentially find a good solution for shortest path problems?
An evaluation function is a mathematical function used to assess the quality of a state.
The function takes in a state as an argument and produces a number.
Important: An evaluation function should only return values between
value = eval(state)
This value can either be minimized or maximized.
Note: a minimization objective f(x) can be transformed into a maximization objective by defining a new objective f'(x) = -f(x) and vice versa
current_state = initial_state
while True:
    best_successor = a **highest-eval** successor state of current_state
    if eval(best_successor) <= eval(current_state):
        return current_state
    current_state = best_successor
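As a concrete illustration, below is a minimal runnable sketch of this hill-climbing loop in Python. The objective (maximize the number of 1-bits in a bit string) and the flip-one-bit neighbourhood are hypothetical choices made just for this example.

```python
import random

def eval_state(state):
    # Hypothetical objective to maximize: number of 1-bits in the state.
    return sum(state)

def successors(state):
    # Hypothetical neighbourhood: all states reachable by flipping exactly one bit.
    for i in range(len(state)):
        neighbour = list(state)
        neighbour[i] = 1 - neighbour[i]
        yield tuple(neighbour)

def hill_climb(initial_state):
    current_state = initial_state
    while True:
        # Pick a highest-eval successor of the current state.
        best_successor = max(successors(current_state), key=eval_state)
        if eval_state(best_successor) <= eval_state(current_state):
            return current_state          # no uphill move left: local maximum
        current_state = best_successor

if __name__ == "__main__":
    start = tuple(random.randint(0, 1) for _ in range(8))
    print(start, "->", hill_climb(start))  # for this objective the climb reaches all ones
```

For less well-behaved objectives the same loop stops at a local maximum, which is why the method is incomplete and suboptimal in general.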
A problem that involves multiple agents interacting within an environment, where each agent has its own goal(s) and makes its own decisions. Can be cooperative, competitive, or mixed-motive.
Agents have conflicting goals.
Two-player zero-sum games, where one player's gain is another's loss.
The next state of the environment depends not only on our action in the current turn, but also on the actions of the other agents, which are assumed to be strategic.
A game tree is a graphical representation of all possible states and moves in a game. Despite the name, it is not necessarily a tree (repeated states make it a graph).

The outcome of the game is displayed beneath each terminal node (end state).
We now have extra properties in the formulation:
An algorithm specifically for two-player zero-sum games.
Core assumption: all players play optimally
Expand function: expand(state)
Transition function: next_state = transition(state, action)
Terminal function: is_terminal(state)
Utility function: utility(state)
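For concreteness, here is a minimal sketch of these functions for a hypothetical game invented just for illustration (a simple take-away game, not from the notes): a pile of stones, players alternate removing 1 or 2 stones, and whoever takes the last stone wins.

```python
# State = (stones_left, player_to_move); player_to_move is +1 (Player A / MAX) or -1 (Player B / MIN).

def transition(state, action):
    stones, player = state
    return (stones - action, -player)       # remove `action` stones, pass the turn

def expand(state):
    stones, _ = state
    return [transition(state, a) for a in (1, 2) if a <= stones]

def is_terminal(state):
    return state[0] == 0                    # no stones left: the game is over

def utility(state):
    # The player who just moved took the last stone and wins.
    # Utility is reported from Player A's (MAX's) perspective: +1 or -1.
    return -state[1]
```

With these four functions in place, the minimax pseudocode below applies directly.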
def minimax(state):
    # At the root, max_value is assumed to also keep track of the action that achieves v.
    (action, v) = max_value(state)
    return action

def max_value(state):
    if is_terminal(state): return utility(state)
    v = -inf
    for next_state in expand(state):
        # The opponent replies optimally, so each successor is worth its min_value to Player A.
        v = max(v, min_value(next_state))
    return v
As we can see, max_value assumes that the opponent plays optimally: each successor is evaluated with min_value, i.e. the value the opponent can force by minimizing Player A's utility.
def min_value(state):
    if is_terminal(state): return utility(state)
    v = inf
    for next_state in expand(state):
        v = min(v, max_value(next_state))
    return v
What actually happens in this algorithm is a depth-first recursive descent: at each MAX node we maximize over the minimum values the opponent can force in each branch.
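As a concrete sketch of this recursion, here is a runnable Python version on a tiny hand-built game tree (the tree and its leaf utilities are hypothetical, chosen only for illustration):

```python
# Internal nodes map action -> subtree; leaves are terminal utilities for Player A (MAX).
GAME_TREE = {
    "a1": {"b1": 3,  "b2": 12, "b3": 8},
    "a2": {"b1": 2,  "b2": 4,  "b3": 6},
    "a3": {"b1": 14, "b2": 5,  "b3": 2},
}

def is_terminal(node):
    return not isinstance(node, dict)

def max_value(node):
    if is_terminal(node):
        return node                          # leaf: the utility itself
    return max(min_value(child) for child in node.values())

def min_value(node):
    if is_terminal(node):
        return node
    return min(max_value(child) for child in node.values())

def minimax(node):
    # The root is a MAX node: pick the action whose subtree has the highest min_value.
    return max(node, key=lambda action: min_value(node[action]))

if __name__ == "__main__":
    print(minimax(GAME_TREE))                # "a1": its worst case (3) beats the other branches
    print(max_value(GAME_TREE))              # minimax value of the root: 3
```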
Time complexity: exponential in depth, O(b^m) for branching factor b and maximum depth m.
Space complexity: polynomial; O(bm) with depth-first exploration.
Complete: Yes, if tree is finite.
Optimal: Yes, against optimal opponent.
Theorem: In any finite, two-player zero-sum game of perfect information, the minimax algorithm computes the optimal strategy for both players, guaranteeing the best achievable outcome for each, assuming both play optimally
Theorem: If Player A follows the minimax strategy and the opponent deviates from optimal play, the utility obtained by Player A is never less than the utility they would obtain against an optimal opponent.
Minimax explores the entire game tree.
However, evaluating a node sometimes becomes unnecessary, as it won't alter the decision.
A variant of minimax that seeks to decrease the number of nodes that are evaluated.
Maintain the best outcomes found so far for Player A (alpha) and Player B (beta):
Prune unnecessary nodes.
Pseudocode
def alpha_beta_search(state):
    # At the root, max_value is assumed to also keep track of the action that achieves v.
    (action, v) = max_value(state, alpha=-inf, beta=inf)
    return action

def max_value(state, alpha, beta):
    if is_terminal(state): return utility(state)
    v = -inf
    for next_state in expand(state):
        v = max(v, min_value(next_state, alpha, beta))
        alpha = max(alpha, v)
        if v >= beta: return v      # Player B already has a better option elsewhere: prune
    return v

def min_value(state, alpha, beta):
    if is_terminal(state): return utility(state)
    v = inf
    for next_state in expand(state):
        v = min(v, max_value(next_state, alpha, beta))
        beta = min(beta, v)
        if v <= alpha: return v     # Player A already has a better option elsewhere: prune
    return v
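A runnable sketch of the same idea, reusing the hypothetical hand-built tree from the minimax example and counting leaf evaluations so the effect of pruning is visible:

```python
import math

GAME_TREE = {
    "a1": {"b1": 3,  "b2": 12, "b3": 8},
    "a2": {"b1": 2,  "b2": 4,  "b3": 6},
    "a3": {"b1": 14, "b2": 5,  "b3": 2},
}

visited = 0  # number of leaves evaluated

def max_value(node, alpha, beta):
    global visited
    if not isinstance(node, dict):
        visited += 1
        return node
    v = -math.inf
    for child in node.values():
        v = max(v, min_value(child, alpha, beta))
        alpha = max(alpha, v)
        if v >= beta:
            return v                         # MIN would never let play reach here: prune
    return v

def min_value(node, alpha, beta):
    global visited
    if not isinstance(node, dict):
        visited += 1
        return node
    v = math.inf
    for child in node.values():
        v = min(v, max_value(child, alpha, beta))
        beta = min(beta, v)
        if v <= alpha:
            return v                         # MAX already has a better option: prune
    return v

if __name__ == "__main__":
    print(max_value(GAME_TREE, -math.inf, math.inf))  # same minimax value as before: 3
    print(visited)                                    # fewer than the 9 leaves plain minimax evaluates
```

How much gets pruned depends on the order in which children are expanded, which is exactly the move-ordering point below.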
Theorem: Alpha-beta pruning does not alter the minimax value or the optimal move computed for the root node; it only reduces the number of nodes evaluated.
Making full use of alpha-beta pruning requires meta-reasoning: reasoning about which computations to do first (e.g. which branch should be expanded first?).
Theorem: Good move ordering increases alpha-beta pruning effectiveness; with "perfect ordering", its time complexity drops to O(b^(m/2)).
Instead of evaluating the full tree, search only up to a cutoff depth.
Modification: at the cutoff, simply return the value of an evaluation function instead of recursing further:
Assigns a score to a game state, indicating how favourable it is to a player.
The heuristic (evaluation) function is an estimate of the true utility, utility(state).
Theorem: In a finite, two-player, zero-sum game, the minimax algorithm with cutoff and an evaluation function returns a move that is optimal with respect to the evaluation function at the cutoff
The move may be suboptimal with respect to the true minimax value
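Finally, a minimal runnable sketch of minimax with a depth cutoff and an evaluation function, reusing the hypothetical take-away game from earlier with a deliberately crude heuristic (both invented for illustration):

```python
# State = (stones_left, player_to_move); player_to_move is +1 (MAX) or -1 (MIN).
# Players alternate removing 1 or 2 stones; whoever takes the last stone wins.

def expand(state):
    stones, player = state
    return [(stones - a, -player) for a in (1, 2) if a <= stones]

def is_terminal(state):
    return state[0] == 0

def utility(state):
    return -state[1]            # the player who just moved wins; reported from MAX's perspective

def eval_fn(state):
    # Crude hypothetical heuristic: positions with few stones left on the
    # opponent's turn look good. It only *estimates* the true utility.
    stones, player = state
    return -player * (1.0 / (1 + stones))

def h_minimax(state, depth):
    # Minimax with cutoff: below `depth`, fall back to the evaluation function.
    if is_terminal(state):
        return utility(state)
    if depth == 0:
        return eval_fn(state)
    values = [h_minimax(s, depth - 1) for s in expand(state)]
    return max(values) if state[1] == +1 else min(values)

if __name__ == "__main__":
    start = (7, +1)
    print(h_minimax(start, depth=2))   # shallow search: the value comes from eval_fn estimates
    print(h_minimax(start, depth=20))  # deep enough to reach true terminal utilities
```

The deep search returns the true minimax value; the shallow search's value is only as good as eval_fn, which is exactly the point of the last theorem above.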