Order the Moves for Alpha-Beta

To see how much work is saved by alpha-beta, first notice that without alpha-beta doing a negamax search to depth d where each state has b possible moves costs b**d. The advantage of alpha-beta depends on the order the moves are searched in. If each side tries its best move first at each state, a tricky proof shows that the cost with alpha-beta will be b**(d/2). If each side orders its moves randomly, the cost with alpha-beta will be b**(2d/3). If the moves are chosen from worst to best, alpha-beta gives no savings; still, it is harmless.

It is important to notice that b**(d/2) is not the same as (b**d)/2. Alpha-beta gives an exponential speedup, letting the computer player search twice as deep as it could have before in the same amount of effort.

It is also true that b**(d/2) is a lot less than b**(2d/3), so we want each side to try their best move first at each state. Of course, if we knew the best move at a state, we would have no need for our search engine. We can use the state score, though, to sort the best looking moves first.

We want to at least get the b**(d/2) that random will give us, but right now our state score has a lot of ties. We will first shuffle the moves, then sort them by score. This will give a good chance of making the best move early.

Our negamax now looks like this:

negamax(s, d, a, b) = do
  if game_over(s) or d = 0 return value(s)
  S = [ move(s, m) | m in moves(s) ]
  sort S by value
  v = -1.0
  a0 = a
  for s0 in S do
    v = max(v, -negamax(s0, d - 1, -b, -a0))
    a0 = max(a0, v)
    if vb return b
  return v
We should also do something similar for best_move.

The algorithm is starting to be complex. It might be worth going back and seeing the small changes we made to get here.

prev index next