CS 410/510 Games: Machine Learning

Machine Learning

PSU CS410/510GAMES Lecture 7
May 16, 2002

  • More probability: perfect Yahtzee
    • Rules
      • Yahtzee
      • Yacht
    • One-player, maximize score
      • Retrograde analysis
      • Nonlinearity
      • Avoiding recomputation
    • More sophisticated
      • One-player, maximize winning chance
      • Two-player, maximize winning chance
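The "retrograde analysis" and "avoiding recomputation" points above can be sketched with a memoized backward computation on a toy one-die game (a stand-in for the full Yahtzee state space; the one-die rules and the `expected` helper are illustrative, not the actual Yahtzee solver):

```python
from functools import lru_cache

# Toy stand-in for the Yahtzee computation: roll one die, score = face
# value, up to two optional rerolls. expected(k) is the expected score
# with k rerolls remaining, before seeing the current face. Memoization
# is what "avoiding recomputation" buys: shared subproblems are solved
# once, working backward from the terminal (no-rerolls) case.
@lru_cache(maxsize=None)
def expected(rerolls):
    if rerolls == 0:
        return sum(range(1, 7)) / 6          # must keep whatever is rolled
    cont = expected(rerolls - 1)             # value of choosing to reroll
    # keep the face only if it beats the expected value of continuing
    return sum(max(face, cont) for face in range(1, 7)) / 6

print(expected(2))                           # optimal two-reroll value
```

Note the nonlinearity: the max() over keep-vs-reroll is exactly why expected values must be computed backward from the end of the game rather than averaged forward.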
  • Machine learning of games
    • Goal: learn state values (vs. learn moves)
    • Generic problems: opponent modeling, overtraining, feature selection
    • TD-Gammon: neural nets for evaluation
      • Gerry Tesauro, IBM: learn backgammon by self-play with a neural-net evaluator, starting from random play
      • Gradient-descent weight updates with TD(lambda) distribute the error signal back over the positions of each game
      • Eventually used hand-coded features plus shallow search to improve play
      • Learning "doubling" is hard
    • Michael Buro: three strategies
      • Multi-ProbCut: learn reasonable alpha-beta (AB) search windows
      • Opening Book: don't lose the same way twice
        • Idea: use PVS, but label leaves leading to a demonstrated win/loss with infinite scores. A win for one side is a loss for the other, so values negate as usual
        • Learning: update the book as position values are "found". (Can always find a "best" position to play for unless tree value is -inf.)
        • Problems
          • Bad opponents mean bad Ws; bad play means bad Ls
          • Expensive to build an opening book this way
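The opening-book idea can be sketched in a few lines of negamax over a toy tree (the tree, heuristic values, and `book` dict are illustrative, not Buro's data structures): positions with a demonstrated result get infinite scores that override the static evaluation, so search steers around known losses.

```python
import math

# "Don't lose the same way twice": book entries record positions whose
# game-theoretic result was demonstrated in play, scored +/-inf for the
# side to move there. Search then prefers any line whose value is finite
# over a line through a demonstrated loss.
children = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}
heuristic = {"a1": 3, "a2": -1, "b1": 5, "b2": 4}
book = {"b1": -math.inf, "b2": -math.inf}  # both lines after b lost before

def negamax(pos):
    if pos in book:                 # demonstrated result overrides eval
        return book[pos]
    if pos not in children:         # frontier: fall back to static eval
        return heuristic[pos]
    return max(-negamax(c) for c in children[pos])

def best_move(pos):
    return max(children[pos], key=lambda c: -negamax(c))

print(best_move("root"))            # avoids b despite b1's high heuristic
```

Learning is then just updating `book` as new results are "found" after each game; a "best" finite-valued move exists unless the whole tree value is -inf.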
      • GLEM: learning eval functions
        • Eval function: g(sum(w[i] * val(c[i]))) where c[i] = and(r[j])
        • Uses gradient descent to adjust weights: effectively "one-neuron net"
        • Features c[i] and atomic relations r[j] are supplied by the implementer
        • Strategy: fit weights on endgame positions first, then back the weight assignments up toward the opening
        • Problems: expense, overtraining, weight assignment
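The GLEM form g(sum(w[i] * val(c[i]))) can be sketched as one-neuron gradient descent on conjunction features (the toy data and the particular conjunctions chosen in `conj_features` are illustrative; real GLEM uses Othello pattern features):

```python
import math

# One-neuron sketch of GLEM-style fitting: each feature c_i is a
# conjunction (AND) of atomic relations r_j, g is the logistic function,
# and plain gradient descent adjusts the weights w_i toward observed
# outcomes.
def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def conj_features(atoms):
    # c[i] = and(r[j]): the three atoms plus two hand-picked conjunctions
    r0, r1, r2 = atoms
    return [r0, r1, r2, r0 and r1, r1 and r2]

def train(samples, epochs=500, lr=0.5):
    w = [0.0] * 5
    for _ in range(epochs):
        for atoms, outcome in samples:
            c = [float(v) for v in conj_features(atoms)]
            pred = g(sum(wi * ci for wi, ci in zip(w, c)))
            err = outcome - pred          # log-loss gradient direction
            for i in range(5):
                w[i] += lr * err * c[i]
    return w

# toy positions: "win" (outcome 1) exactly when both r0 and r1 hold
samples = [((a, b, c_), float(a and b))
           for a in (0, 1) for b in (0, 1) for c_ in (0, 1)]
w = train(samples)
pred = g(sum(wi * ci for wi, ci in
             zip(w, [float(v) for v in conj_features((1, 1, 0))])))
print(round(pred, 2))                     # high: r0 and r1 both hold
```

The conjunction feature r0 and r1 is what lets this single "neuron" represent the AND concept that the raw atoms alone cannot.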
    • GAs (genetic algorithms) for evaluation
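The last bullet can be sketched as a genetic algorithm evolving evaluation-function weight vectors. Everything here is illustrative: the fitness function just measures closeness to a hidden "ideal" weight vector, standing in for tournament results against fixed opponents.

```python
import random

# GA sketch for tuning evaluation weights: selection keeps the
# best-scoring weight vectors, crossover mixes them coordinate-wise,
# and Gaussian mutation explores nearby vectors.
random.seed(1)
IDEAL = [0.5, -0.2, 0.9, 0.1]        # stand-in for "true" good weights

def fitness(w):                       # stand-in for tournament score
    return -sum((a - b) ** 2 for a, b in zip(w, IDEAL))

def mutate(w):
    return [x + random.gauss(0, 0.1) for x in w]

def crossover(a, b):
    return [random.choice(pair) for pair in zip(a, b)]

pop = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(30)]
for gen in range(60):
    pop.sort(key=fitness, reverse=True)
    elite = pop[:10]                  # keep the best evaluators
    pop = elite + [mutate(crossover(random.choice(elite),
                                    random.choice(elite)))
                   for _ in range(20)]
best = max(pop, key=fitness)
print(round(fitness(best), 3))        # near 0 after convergence
```

In practice fitness would come from self-play or matches, which makes each generation expensive, the same cost concern raised for GLEM above.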