Artificial Intelligence for Games

Prof. Dr. Ullrich Köthe, Prof. Dr. Carsten Rother, SS 2019
Thursdays, 14:00-16:00, Mathematikon A, 2nd floor, SR 2/103

Games have always been a favorite playground for artificial intelligence research. All major AI ideas have quickly found their way into game-playing agents. This seminar will review the most remarkable milestones of game AI, from simple rule-based solutions for Tic-Tac-Toe and Connect Four to super-human performance in Go and Poker. Following the evolution of these ideas is an entertaining tour-de-force through the history of artificial intelligence itself.

Mathematically inclined students from all fields are welcome to participate. Please register for the seminar via Müsli and send me an email with your favorite papers, especially if you want to present at the beginning of the semester. Every participant needs to give a 45-minute talk and hand in a 10-page report; the seminar is worth 4 credit points (according to the computer science rules).


Schedule

18. April Jacqueline Wagner: Programming a computer for playing Chess (Shannon) and Digital computers applied to games (Turing) Slides | Report
Sebastian Einsiedler: Adversarial Search (Russell, Norvig) Slides | Report
25. April Frank Gabel: Some Studies in Machine Learning Using the Game of Checkers (Samuel) Slides | Report
Fabian Jäger: Checkers is solved (Schaeffer et al.) Slides | Report
2. May Jessica Löhr: A review of game-tree pruning (Marsland) Slides | Report
Qingyang Cao: Enhanced forward pruning (Winands et al.) Slides | Report
9. May no seminar
16. May Philipp de Sombre: Game theory Slides | Report
Robert Klassert: Monte-Carlo Tree Search Slides | Report
23. May Oliver Mautschke: Evolutionary algorithms for games Slides | Report
Denis Zavadski: Introduction to Reinforcement Learning Slides | Report
6. June Christian Buschmann: PacMan / Ms. PacMan Slides | Report
Mustafa Ibrahim: BomberMan Tournament Slides | Report
13. June Jannik Petersen: AWESOME (Conitzer and Sandholm) Slides | Report
Carsten Lüth: Human-level control through deep reinforcement learning (Mnih et al.) Slides | Report
27. June Stephen Schaumann: Robot juggling Slides | Report
Kerstin Küsters: Jenga robot Slides | Report
4. July Florian Brunner: Mastering the game of Go (Silver et al.) Slides | Report
Philipp Wimmer: A general reinforcement learning algorithm (Silver et al.) Slides | Report
11. July Johannes Daub: AlphaStar Slides | Report
Lasse Becker-Czarnetzki: DeepStack Slides | Report
18. July Alexander Römelt: Opponent modelling (Bakkes) Slides | Report
Michael Hartmann: Level-0 methods for predicting human behavior (Wright et al.) Slides | Report
25. July Yebei Hu: Robot soccer Slides | Report
Steven Kollortz: OpenAI Five Slides | Report


Topics to Choose From:


Topic 1: The Beginnings

Games were already considered important benchmarks by the pioneers of artificial intelligence in the 1950s.

  • (speaker: Jacqueline Wagner) Claude Shannon (1950): "Programming a computer for playing Chess" (PDF)
  • (speaker: Jacqueline Wagner) Alan Turing (1953): "Digital computers applied to games" (PDF)
  • Newell, Allen and Shaw, John Clifford and Simon, Herbert A (1958): "Chess-playing programs and the problem of complexity" (PDF)
  • (speaker: Frank Gabel) Samuel, A. L. (1959): "Some Studies in Machine Learning Using the Game of Checkers" (PDF)

Topic 2: In Search of Perfect Play

Initially, game AI research focused on algorithms that would achieve provably perfect play. The basic algorithms are called "minimax search" and "alpha-beta pruning"; a minimal sketch of both follows the reading list below. A number of relatively simple games could be fully solved this way.

  • (speaker: Sebastian Einsiedler) Chapter 5 "Adversarial Search" in Russell, Stuart J and Norvig, Peter (2016): "Artificial intelligence: a modern approach" (PDF) describes the basic theory and applies it to tic-tac-toe and Chess
  • Victor Allis (1988): "A Knowledge-based Approach of Connect-Four: The Game is Solved: White Wins" (PDF)
    see also:
    • Gilles Vandewiele (2017): "Creating the (nearly) perfect connect-four bot with limited move time and file size" (PDF)
    • Pascal Pons (2019): "Solving Connect 4: How to build a perfect AI" (Blog, includes a very good introduction to the fundamental algorithms for game AI)
  • (speaker: Fabian Jäger) Schaeffer, Jonathan and Burch, Neil and Björnsson, Yngvi and Kishimoto, Akihiro and Müller, Martin and Lake, Robert and Lu, Paul and Sutphen, Steve (2007): "Checkers is solved" (PDF)
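
To make the basic machinery concrete, here is a minimal Python sketch of minimax search with alpha-beta pruning. It is an illustration, not taken from any of the papers above: the game interface (moves(), apply(), is_terminal(), value()) is a hypothetical placeholder that would have to be implemented for each game.

  # Minimal sketch of minimax search with alpha-beta pruning for a
  # two-player zero-sum game. The game interface (moves, apply,
  # is_terminal, value) is a hypothetical placeholder.
  import math

  def alphabeta(state, depth, alpha, beta, maximizing):
      if depth == 0 or state.is_terminal():
          return state.value()              # static evaluation, maximizer's view
      if maximizing:
          best = -math.inf
          for move in state.moves():
              best = max(best, alphabeta(state.apply(move), depth - 1, alpha, beta, False))
              alpha = max(alpha, best)
              if alpha >= beta:             # beta cutoff: opponent avoids this branch
                  break
          return best
      else:
          best = math.inf
          for move in state.moves():
              best = min(best, alphabeta(state.apply(move), depth - 1, alpha, beta, True))
              beta = min(beta, best)
              if beta <= alpha:             # alpha cutoff
                  break
          return best

Called at the root with alpha = -inf, beta = +inf and unlimited depth, this computes the exact game value; a bounded depth plus a heuristic value() gives the practical variants discussed in Topic 3.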

Topic 3: Heuristics Speed Up Move Selection

Most interesting games are too complex to be solved exactly. Before the emergence of learning, heuristic search tree pruning was the best way to achieve good or even super-human strength with realistic resources.

  • (speaker: Jessica Löhr) Marsland, T Anthony (1986): "A review of game-tree pruning" (PDF)
  • Buro, Michael (1998): "From simple features to sophisticated evaluation functions" (PDF)
  • (speaker: Qingyang Cao) Tabibi, Omid David and Netanyahu, Nathan S (2002): "Verified null-move pruning" (PDF)
  • (speaker: Qingyang Cao) Winands, Mark HM and Van den Herik, H Jaap and Uiterwijk, Jos WHM and Van der Werf, Erik CD (2005): "Enhanced forward pruning" (PDF)
  • Robles, David and Lucas, Simon M (2008): "A simple tree search method for playing Ms. Pac-Man" (PDF)
     
  • (speaker: Robert Klassert) Chaslot, Guillaume and Bakkes, Sander and Szita, Istvan and Spronck, Pieter (2008): "Monte-Carlo Tree Search: A New Framework for Game AI" (PDF) -- a compact sketch of the method follows this list
  • Browne, Cameron B et al. (2012): "A survey of Monte Carlo tree search methods" (PDF)
     
  • Schaeffer, Jonathan and Culberson, Joseph and Treloar, Norman and Knight, Brent and Lu, Paul and Szafron, Duane (1992): "A world championship caliber checkers program" (PDF) -- Chinook, the first program to almost beat checkers world champion Marion Tinsley
  • Campbell, Murray and Hoane Jr, A Joseph and Hsu, Feng-Hsiung (2002): "Deep Blue" (PDF) -- the first computer to beat Chess world champion Garry Kasparov
  • Korf, Richard E and Felner, Ariel (2002): "Disjoint pattern database heuristics" (PDF) -- efficiently solves the 15-puzzle game and its 24 pieces cousin
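
As a complement to the two survey papers, here is a compact Python sketch of the four MCTS phases (selection, expansion, simulation, backpropagation) with UCT selection. The game interface (moves, apply, is_terminal, random_playout) is hypothetical; random_playout is assumed to return +1/0/-1 from the perspective of the player who just moved.

  # Compact sketch of Monte-Carlo Tree Search with UCT selection.
  # The game interface is a hypothetical placeholder.
  import math, random

  class Node:
      def __init__(self, state, parent=None):
          self.state, self.parent = state, parent
          self.children, self.visits, self.wins = [], 0, 0.0

  def uct_child(node, c=1.4):
      # pick the child maximizing the UCB1 score (exploitation + exploration)
      return max(node.children, key=lambda ch: ch.wins / ch.visits
                 + c * math.sqrt(math.log(node.visits) / ch.visits))

  def mcts(root, iterations=1000):
      for _ in range(iterations):
          node = root
          # 1. selection: descend as long as all children have been tried
          while node.children and all(ch.visits > 0 for ch in node.children):
              node = uct_child(node)
          # 2. expansion: create children for all legal moves
          if not node.state.is_terminal() and not node.children:
              node.children = [Node(node.state.apply(m), node) for m in node.state.moves()]
          if node.children:
              fresh = [ch for ch in node.children if ch.visits == 0]
              node = random.choice(fresh or node.children)
          # 3. simulation: random playout from here to a terminal state
          result = node.state.random_playout()
          # 4. backpropagation with alternating sign (zero-sum game)
          while node is not None:
              node.visits += 1
              node.wins += result
              result = -result
              node = node.parent
      return max(root.children, key=lambda ch: ch.visits)   # most robust move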

Topic 4: Machine Learning to the Rescue

Finding and implementing good heuristics is difficult and time consuming. Learning a good playing strategy from high-level human games or from scratch promises to be much simpler, if it works; a schematic temporal-difference learning loop follows the reading list below.

  • (speaker: Denis Zavadski) Chapter 17 "Making Complex Decisions" and chapter 21 "Reinforcement Learning" in Russell, Stuart J and Norvig, Peter (2016): "Artificial intelligence: a modern approach" (PDF) describe the underlying learning theory
  • Tesauro, Gerald (1994): "TD-Gammon, a self-teaching backgammon program, achieves master-level play" (PDF) -- the first learning-based game AI that really worked
    see also:
    • Tesauro, Gerald (1995): "Temporal difference learning and TD-Gammon" (PDF)
    • Tesauro, Gerald (2002): "Programming backgammon using self-teaching neural nets" (PDF)
  • Thrun, Sebastian (1995): "Learning to play the game of Chess" (PDF)
  • Chellapilla, Kumar and Fogel, David B (1999): "Evolving neural networks to play checkers without relying on expert knowledge" (PDF)
  • Buro, Michael (2002): "Improving heuristic mini-max search by supervised learning" (PDF)
  • (speaker: Jannik Petersen) Conitzer, Vincent and Sandholm, Tuomas (2007): "AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents" (PDF)
  • Sabatelli, Matthia and Bidoia, Francesco and Codreanu, Valeriu and Wiering, Marco (2018): "Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead" (PDF)
     
  • (speaker: Oliver Mautschke) Evolutionary algorithms for games (to be specified)
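
The common thread of several papers above is temporal-difference learning. Below is a schematic TD(0) self-play loop in the spirit of TD-Gammon; for readability the value function is a lookup table rather than Tesauro's neural network, and the game interface (player, moves, apply, key, is_terminal, value) is a hypothetical stand-in.

  # Schematic TD(0) self-play learning in the spirit of TD-Gammon.
  # V is a table here for simplicity; TD-Gammon used a neural network.
  # The game interface is a hypothetical stand-in.
  from collections import defaultdict
  import random

  def td0_selfplay(new_game, episodes=10000, alpha=0.1, eps=0.1):
      V = defaultdict(float)          # state key -> estimated value for player 1
      for _ in range(episodes):
          state = new_game()
          while not state.is_terminal():
              moves = state.moves()
              pick = max if state.player() == 1 else min   # zero-sum: P2 minimizes
              if random.random() < eps:                    # occasional exploration
                  move = random.choice(moves)
              else:
                  move = pick(moves, key=lambda m: V[state.apply(m).key()])
              nxt = state.apply(move)
              # TD(0): move the current estimate toward the successor's estimate
              target = nxt.value() if nxt.is_terminal() else V[nxt.key()]
              V[state.key()] += alpha * (target - V[state.key()])
              state = nxt
      return V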

Topic 5: Learning for Video Games

Video games are an especially fruitful domain for learning. This is particularly challenging when the agent can only access the screen content, but not the internal game state.

  • (speaker: Christian Buschmann) Gallagher, Marcus and Ryan, Amanda (2003): "Learning to play Pac-Man: An evolutionary, rule-based approach" (PDF)
  • (speaker: Christian Buschmann) Szita, Istvan and Lorincz, Andras (2007): "Learning to play using low-complexity rule-based policies: Illustrations through Ms. Pac-Man" (PDF)
     
  • Bellemare, Marc G and Naddaf, Yavar and Veness, Joel and Bowling, Michael (2013): "The arcade learning environment: An evaluation platform for general agents" (PDF) -- a software framework for easy experimentation with video games
  • Brockman, Greg and Cheung, Vicki and Pettersson, Ludwig and Schneider, Jonas and Schulman, John and Tang, Jie and Zaremba, Wojciech (2016): "OpenAI Gym" (PDF) -- another software framework for easy experimentation with video games; a minimal usage sketch follows this list
     
  • (speaker: Carsten Lüth) Mnih, Volodymyr et al. (2015): "Human-level control through deep reinforcement learning" (PDF) -- human-level strength in ATARI games
  • Liang, Yitao and Machado, Marlos C and Talvitie, Erik and Bowling, Michael (2016): "State of the art control of ATARI games using shallow reinforcement learning" (PDF)
  • (speaker: Steven Kollortz?) OpenAI (2018): "OpenAI Five" -- long blog about OpenAI's learning-based approach to playing Dota 2
  • (speaker: Johannes Daub) Oriol Vinyals et al. (2017): "StarCraft II: A new challenge for reinforcement learning" (PDF)
  • (speaker: Johannes Daub) Oriol Vinyals et al. (2019): "AlphaStar: Mastering the Real-Time Strategy Game StarCraft II" (PDF)
     
  • (speaker: Mustafa Ibrahim) BomberMan 2019 champions -- how did they do it? Review of the winning submissions to the final project of last semester's Fundamentals of Machine Learning lecture
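
To give a flavor of how frameworks like the Arcade Learning Environment and OpenAI Gym are used, here is a minimal agent/environment interaction loop written against Gym's classic reset()/step() API (as of 2019); the random policy is just a placeholder for a learned agent.

  # Minimal interaction loop with OpenAI Gym (pre-2021 step() API).
  # The random policy is a placeholder for a learned agent.
  import gym

  env = gym.make("CartPole-v1")
  for episode in range(5):
      observation = env.reset()
      total_reward, done = 0.0, False
      while not done:
          action = env.action_space.sample()              # random placeholder policy
          observation, reward, done, info = env.step(action)
          total_reward += reward
      print(f"episode {episode}: return {total_reward}")
  env.close()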

Topic 6: The DeepMind Breakthroughs

AlphaGo and its successors lifted learning-based agents to an entirely new level. Especially remarkable is the success of AlphaZero, which mastered the games of Go and Chess without any human game data, just by playing millions of games against itself; a schematic version of its training loop follows the reading list below.

  • (speaker: Florian Brunner) Silver, David et al. (2016): "Mastering the game of Go with deep neural networks and tree search" (PDF)
  • (speaker: Philipp Wimmer) Silver, David et al. (2017): "Mastering the game of Go without human knowledge" (PDF)
  • (speaker: Philipp Wimmer) Silver, David et al. (2017): "Mastering Chess and Shogi by self-play with a general reinforcement learning algorithm" (PDF)
  • (speaker: Philipp Wimmer) Silver, David et al. (2018): "A general reinforcement learning algorithm that masters Chess, Shogi, and Go through self-play" (PDF)
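
The published AlphaZero method can be summarized schematically as follows. This is a high-level paraphrase, not DeepMind's code: run_mcts and train_step are caller-supplied stand-ins for the network-guided tree search and the gradient update, and the game interface is hypothetical.

  # High-level sketch of the AlphaZero-style self-play training loop.
  # run_mcts, train_step and the game interface are hypothetical stand-ins.
  import random

  def alphazero_loop(network, new_game, run_mcts, train_step,
                     iterations=100, games_per_iter=25):
      buffer = []                                      # (state, visit_dist, outcome)
      for _ in range(iterations):
          for _ in range(games_per_iter):
              game, trajectory = new_game(), []
              while not game.is_terminal():
                  visit_dist = run_mcts(game, network)   # MCTS guided by the network
                  trajectory.append((game.encode(), visit_dist))
                  moves = list(visit_dist)
                  weights = [visit_dist[m] for m in moves]
                  game = game.apply(random.choices(moves, weights)[0])
              z = game.outcome()                         # final result as value target
              buffer.extend((s, d, z) for s, d in trajectory)
          # regress the policy head onto the MCTS visit distributions
          # and the value head onto the game outcomes
          network = train_step(network, buffer)
      return network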

Topic 7: Incomplete Information Games (e.g. Poker)

If the game state is not fully visible (as in most card games), learning good play is even more challenging; a toy regret-matching sketch follows the reading list below.

  • Sheppard, Brian (2002): "World-championship-caliber Scrabble" (PDF)
  • Billings, Darse and Davidson, Aaron and Schaeffer, Jonathan and Szafron, Duane (2002): "The challenge of Poker" (PDF)
  • Sandholm, Tuomas (2010): "The state of solving large incomplete-information games, and application to Poker" (PDF)
  • Heinrich, Johannes and Silver, David (2015): "Deep reinforcement learning from self-play in imperfect-information games" (PDF)
  • (speaker: Lasse Becker-Czarnetzki) Moravcik, Matej et al. (2017): "DeepStack: Expert-level artificial intelligence in heads-up no-limit Poker" (PDF)
  • Brown, Noam and Sandholm, Tuomas (2017): "Safe and nested subgame solving for imperfect-information games" (PDF)
  • (speaker: Steven Kollortz?) Brown, Noam and Sandholm, Tuomas (2018): "Superhuman AI for heads-up no-limit Poker: Libratus beats top professionals" (PDF)
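
The poker breakthroughs above build on counterfactual regret minimization, whose simplest ingredient is regret matching. The toy sketch below applies plain regret matching to rock-paper-scissors, where the average self-play strategy converges to the uniform Nash equilibrium; it is purely illustrative and far simpler than DeepStack or Libratus.

  # Regret matching on rock-paper-scissors: the average self-play
  # strategy converges to the uniform equilibrium. Toy example only.
  import random

  ACTIONS = 3                                    # rock, paper, scissors
  PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # payoff to player 1 for (a, b)

  def strategy_from_regrets(regrets):
      positive = [max(r, 0.0) for r in regrets]
      total = sum(positive)
      return [p / total for p in positive] if total > 0 else [1.0 / ACTIONS] * ACTIONS

  def regret_matching(iterations=100000):
      regrets = [0.0] * ACTIONS
      strategy_sum = [0.0] * ACTIONS
      for _ in range(iterations):
          strat = strategy_from_regrets(regrets)
          strategy_sum = [s + p for s, p in zip(strategy_sum, strat)]
          a = random.choices(range(ACTIONS), strat)[0]   # own action
          b = random.choices(range(ACTIONS), strat)[0]   # self-play opponent
          for alt in range(ACTIONS):                     # regret of not playing alt
              regrets[alt] += PAYOFF[alt][b] - PAYOFF[a][b]
      total = sum(strategy_sum)
      return [s / total for s in strategy_sum]           # average strategy

  print(regret_matching())                               # approaches [1/3, 1/3, 1/3]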

Topic 8: Opponent Modelling

If the opponent doesn't play perfectly, exploiting their weaknesses can greatly enhance the winning chances; a toy frequency-based opponent model follows the reading list below.

  • (speaker: Alexander Römelt) Bakkes, Sander CJ and Spronck, Pieter HM and Van Den Herik, H Jaap (2009): "Opponent modelling for case-based adaptive game AI" (PDF)
  • Ganzfried, Sam and Sandholm, Tuomas (2011): "Game theory-based opponent modeling in large imperfect-information games" (PDF)
  • Southey, Finnegan et al. (2012): "Bayes' bluff: Opponent modelling in poker" (PDF)
  • Hartford, Jason S and Wright, James R and Leyton-Brown, Kevin (2016): "Deep Learning for Predicting Human Strategic Behavior" (PDF)
  • (speaker: Michael Hartmann) James R. Wright and Kevin Leyton-Brown (2019): "Level-0 Models for Predicting Human Behavior in Games" (PDF)
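
As a toy illustration of the idea, an opponent model can be as simple as a move-frequency table that the agent best-responds to. The sketch below is hypothetical and far cruder than any of the cited methods.

  # Toy frequentist opponent model for rock-paper-scissors: predict the
  # opponent's most frequent move and play its counter.
  from collections import Counter

  BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

  class FrequencyOpponentModel:
      def __init__(self):
          self.counts = Counter()

      def observe(self, opponent_move):
          self.counts[opponent_move] += 1

      def best_response(self):
          if not self.counts:
              return "rock"                        # arbitrary opening move
          predicted = self.counts.most_common(1)[0][0]
          return BEATS[predicted]                  # exploit the predicted move

  model = FrequencyOpponentModel()
  for move in ["rock", "rock", "paper", "rock"]:   # observed opponent history
      model.observe(move)
  print(model.best_response())                     # -> "paper"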

Topic 9: Robots Playing in the Real World

Some people are crazy enough to set game-playing robots loose.

  • (speaker: Stephen Schaumann) Schaal, Stefan and Atkeson, Christopher G (1994): "Robot juggling: implementation of memory-based learning" (PDF)
  • Todorov, Emanuel and Erez, Tom and Tassa, Yuval (2012): "MuJoCo: A physics engine for model-based control" (PDF) -- a sophisticated robot simulation environment
  • Kober, Jens and Bagnell, J Andrew and Peters, Jan (2013): "Reinforcement learning in robotics: A survey" (PDF) -- introduction to reinforcement learning from the robotics perspective
  • MacAlpine, Patrick and Depinet, Mike and Stone, Peter (2015): "UT Austin Villa 2014: RoboCup 3D simulation league champion via overlapping layered learning" (PDF) -- robots learning team play in a simulated football environment
  • Rodriguez, Diego et al. (2017): "Advanced soccer skills and team play of RoboCup 2017 TeenSize winner NimbRo" (PDF) -- a robot team playing football
  • Ficht, Grzegorz et al. (2017): "Grown-up NimbRo Robots Winning RoboCup 2017 Humanoid AdultSize Soccer Competitions" (PDF) -- another robot team playing football
  • Fazeli, N. and Oller, M. and Wu, J. and Wu, Z. and Tenenbaum, J. B. and Rodriguez, A. (2019): "See, feel, act: Hierarchical learning for complex manipulation skills with multisensory fusion" (PDF) -- a Jenga-playing robot

Topic 10: Game Theory

Covers some mathematical underpinnings of optimal game strategy; a worked equilibrium check follows the reading list below.

  • (speaker: Philipp de Sombre) Nash, John (1951): "Non-cooperative games" (PDF) -- introduces the central concept of a Nash equilibrium, a strategy profile where no player can gain by unilaterally changing their strategy
    see also:
    • Wikipedia page on Nash equilibrium -- contains good explanations and further reading
  • Koutsoupias, Elias and Papadimitriou, Christos (1999): "Worst-case equilibria" (PDF)
  • Nisan, Noam and Ronen, Amir (2001): "Algorithmic mechanism design" (PDF) -- how to make agents collaborate?
  • Leyton-Brown, Kevin and Shoham, Yoav (2008): "Essentials of game theory: A concise multidisciplinary introduction" (PDF)
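
To make the equilibrium concept concrete, the sketch below enumerates the pure Nash equilibria of a small bimatrix game by testing all unilateral deviations; the Prisoner's Dilemma payoffs are the standard textbook example.

  # Finding pure Nash equilibria of a bimatrix game by checking all
  # unilateral deviations, illustrated on the Prisoner's Dilemma.
  # (row, col) -> (payoff to row player, payoff to column player)
  PD = {(0, 0): (-1, -1),   # both cooperate
        (0, 1): (-3,  0),   # row cooperates, column defects
        (1, 0): ( 0, -3),
        (1, 1): (-2, -2)}   # both defect

  def is_nash(payoffs, row, col):
      u_row, u_col = payoffs[(row, col)]
      # no player may gain by deviating unilaterally
      row_ok = all(payoffs[(r, col)][0] <= u_row for r in (0, 1))
      col_ok = all(payoffs[(row, c)][1] <= u_col for c in (0, 1))
      return row_ok and col_ok

  equilibria = [(r, c) for r in (0, 1) for c in (0, 1) if is_nash(PD, r, c)]
  print(equilibria)   # -> [(1, 1)]: mutual defection is the only equilibrium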

Further Reading

These references may be useful for additional explanations and nice illustrations of some of the methods mentioned above.

  • Millington, Ian and Funge, John (2009): "Artificial intelligence for games" (PDF)
  • Yannakakis, Georgios N and Togelius, Julian (2018): "Artificial intelligence and games" (PDF)
  • Riddles.io (2018): "The AI Games" -- upload and try your own agents for various games