Go is a classic board game with relatively simple rules. In practice, however, it is very complex because of the large number of alternatives to consider at each move. Before 2015, most Go programs could attain only low amateur ranks. In 2016, the program AlphaGo beat the highly regarded former world Go champion Lee Sedol 4 games to 1, the first time a player of Lee Sedol's calibre had lost to a Go program. AlphaGo achieved its success through a combination of deep neural networks and reinforcement learning. The learning phase used thousands of hours of compute time on Google's high-performance TPU clusters. Our project studies the techniques used by AlphaGo and AlphaZero (a later improvement on AlphaGo), with an emphasis on improving both the learning speed and the performance of the program. A key research question is whether the success of AlphaGo can be replicated using fewer computational resources.