Do Do-Undo

We have now build an adversary that is OK as far as complexity—we have done many of the exponential prunes we can do. Our constant factors, though, are quite awful. The root cause of this is that we make lots of copies of things in memory, and spend lots of time reading those copies. This means that we do many memory copies, nothing fits in the on-CPU cache, and that the Java garbage collector runs often.

To fix this, we will reuse the same state and the same scoring of that state over and over. The idea is to make a move on the state in place ("do"), then put the state back when done with it ("undo"). We can do the same thing with the value: keep track of the current state value, change it when a move is made ("do"), and put it back when the move is undone ("undo"). The result will only be a linear speedup, but it will be a big linear speedup that will be well worth the work.

First, modify your move method to change the current state and return what piece, if any, was captured. Since right now our value function is just about the number of pieces on each side, we will be able to tell how to change it from the return on move. It will be easiest to change your code so that the values go from -26 to 26 rather than -1.0 to 1.0. Then you can just subtract piece values as pieces are captured. This will also remove the floating point from your code, which might or might not help speed it up.

Next, add an unmove method to State that takes the piece captured as an argument.

Now modify negamax once again. Add a value argument, and change the moves to be do-undo. You will do-undo twice: once when sorting the moves (not states any more), and once when doing the recursive call. You want to modify best_move as well, so make the sorting of moves a separate function.

sorted_moves_from(s, val) = do
  M = []
  for m in moves(s) do
    p = s.move(m)
    v0 = val - piece_value(p)
    append <move = m, value = v0> to M
    s.unmove(m, p)
  shuffle(M)
  sort M by value
  return moves(M)

negamax(s, d, a, b, val) = do
  if game_over(s) or d = 0 return val
  M = sorted_moves_from(s, val)
  v = -1.0
  a0 = a
  for m in M do
    p = s.move(m)
    v0 = val - piece_value(p)
    v = max(v, -negamax(s, d - 1, -b, -a0, v0))
    a0 = max(a0, v)
    s.unmove(m, p)
    if vb return b
  return v
The same changes could be made to best_move, although it matters less.

Be careful! It is important that a call to negamax or best_move leave its state argument the same as it found it. The recursion through negamax depends on it.


prev index next