Behind the scenes at Google DeepMind Challenge Match: AlphaGo versus Lee Se-dol
Rarely, if ever, do the worlds of algorithms and board games evoke images of two fighters in the ring, but it was billed as the ultimate challenge. Seven days, five matches, a nerve-wracking four to five hours per game – and a $1 million prize at stake.
On one side of the table, AlphaGo, a computer program developed by artificial intelligence researchers at Google DeepMind in London, England. On the other side, Lee Se-dol of South Korea, considered the world’s top-ranked (human) Go player.
“Computer strategies can be very different from human strategies,” said Chris Maddison, a Massey College fellow and PhD student in the department of computer science. “To measure the strength of a computer program you can see how often it wins against other programs, but it’s hard to know whether those metrics are even accurate – until you match a computer against a human.”
Maddison is among the research contributors to AlphaGo – a group that includes U of T alumni Timothy Lillicrap (cognitive science) and Ilya Sutskever (computer science). He attended the Google DeepMind Challenge Match March 8 to 15 at the Four Seasons Hotel in Seoul, South Korea, where AlphaGo faced its second human opponent, having previously beaten European Go champion Fan Hui 5-0.
“Fan Hui is a professional, and a great player,” said Maddison. “But our next goal was to challenge Lee Se-dol, the strongest player of the last decade and an iconic figure in the Go world.”
Following DeepMind’s publication of AlphaGo’s performance in Nature Jan. 27, the professional Go community analyzed the results and determined the algorithm could beat Hui, but its performance in those games didn’t make it an obvious winner when matched against Se-dol, who holds the most Go titles of the past decade. By their estimates, AlphaGo had only a 5 to 10 percent chance of winning each game. But AlphaGo had not stopped playing since those games nearly six months earlier.
“We had the games against Fan Hui as an anchor, where we knew AlphaGo’s strength in October. Going into this match, we didn’t know its ability now, calibrated against a human player as strong as Se-dol.”
The contest was live streamed on DeepMind’s YouTube channel and live broadcast across television networks in China, Korea and Japan. Over 60 million Chinese viewers took in the first match. Maddison spoke from the tournament to writer Nina Haikara about AlphaGo’s final result:
What was the reaction to AlphaGo winning its first match?
The overwhelming feeling in the post-match press conference was shock. Even after our success in October, when we challenged Fan Hui, no one expected AlphaGo to improve as much as it did. We had internal tests that gave us confidence in our chances, but the true test was playing against someone of the caliber of Lee Se-dol. As a team we certainly felt relief and excitement.
Fan Hui, who previously lost to AlphaGo, said its 37th move in its winning second match was strange, but beautiful. How did AlphaGo become a creative player?
AlphaGo uses a technology called neural networks to help it select moves. The first neural network, called the policy network, is trained to mimic the preferences of human experts over the possible moves in the current board position. AlphaGo uses the policy network as a guide for exploring possible outcomes and finally evaluates those outcomes with a value network, which is trained to predict the chance of winning from a certain board position. Even though the search is initially guided by human-like preferences, AlphaGo can overwhelm that bias if it discovers moves that lead to better outcomes.
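The interplay Maddison describes, a policy that proposes moves and a value estimate that judges the results, can be illustrated with a small sketch. This is not AlphaGo's actual search: it looks only one move ahead, the three moves and their numbers are invented, and the exploration rule (a bonus proportional to the policy's prior that shrinks as a move is tried more often) is an assumption chosen for illustration. The names `select_move`, `toy_policy`, `toy_value`, `PRIORS` and `VALUES` are all hypothetical.

```python
import math

def select_move(state, moves, apply_move, policy, value,
                simulations=400, exploration=1.0):
    """Pick a move by repeatedly trying candidates, guided by policy priors."""
    priors = policy(state, moves)          # move -> prior probability
    visits = {m: 0 for m in moves}         # how often each move was tried
    total = {m: 0.0 for m in moves}        # sum of value estimates per move

    for _ in range(simulations):
        n = sum(visits.values())

        def score(m):
            # Average value seen so far (0 if the move is still untried).
            mean = total[m] / visits[m] if visits[m] else 0.0
            # Exploration bonus: large for moves the policy likes,
            # shrinking as the move accumulates visits.
            bonus = exploration * priors[m] * math.sqrt(n + 1) / (1 + visits[m])
            return mean + bonus

        m = max(moves, key=score)
        visits[m] += 1
        total[m] += value(apply_move(state, m))

    # The most-visited move is the one the search ended up trusting.
    return max(moves, key=lambda m: visits[m])

# Invented toy position: the "human-like" policy prefers move "a",
# but the value estimate says the unpopular move "c" wins most often.
PRIORS = {"a": 0.70, "b": 0.25, "c": 0.05}
VALUES = {"a": 0.30, "b": 0.40, "c": 0.90}

def toy_policy(state, moves):
    return {m: PRIORS[m] for m in moves}

def toy_apply(state, move):
    return state + (move,)                 # a state is the tuple of moves played

def toy_value(state):
    return VALUES[state[-1]]               # estimated chance of winning

if __name__ == "__main__":
    print(select_move((), ["a", "b", "c"], toy_apply, toy_policy, toy_value))
```

With a single simulation the sketch simply follows the policy's favourite; given enough simulations it eventually tries the low-prior move, sees the better value and shifts most of its visits there, which is the sense in which a search can move past its human-like starting bias.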
Song Taegon, a Korean commentator who ranks as a 9-dan professional Go player, said AlphaGo had viewers rethinking the game. Did moves previously thought of as bad or poor help AlphaGo win?
AlphaGo frequently makes stylistic choices that do not fit the conventional Go wisdom. Move 37 in the second match is a great example. I think it has the potential to challenge long-held beliefs about the game and, as commentator Michael Redmond has said, usher in a revolution in our understanding of Go. I am excited to see how these historic games can potentially help us deepen our understanding of this ancient and beautiful game.
Did losing to Se-dol in the fourth match make AlphaGo a more “human” player? Why do you think AlphaGo lost one game?
It is generally quite difficult in Go to decide which moves are decisive. It is equally difficult to tease apart exactly which weakness of AlphaGo led to that loss. That is something we are excited to understand as researchers. The consensus seems to be that move 78 by Lee Se-dol, a move dubbed “God’s Touch” by Go professionals, was a turning point. Certainly it surprised AlphaGo, which thought it had a one-in-ten-thousand chance of being played. We still have a lot to learn and will be spending the next few weeks looking through the games to see how we can improve AlphaGo.
Games are ideal tests for artificial intelligence, and Go was previously thought impossible for a computer to learn. Are there any games left for artificial intelligence to tackle next?
In Go, players have all of the information needed to pick a move. Games where information is hidden, like your opponent's hand in poker, are still quite difficult for computer scientists to tackle. So, there are still many interesting challenges for artificial intelligence in games.
Can we take insights from AlphaGo beyond game-playing artificial intelligence research?
Many important problems in science or medicine can be framed as search problems. Although AlphaGo itself can’t even tie its shoelaces, the long-term goal is that the technology behind AlphaGo can be applied generically to help scientists make breakthroughs. Like anything, it will take concerted effort, and as a field we are still at the earliest stages, but the success of AlphaGo gives hope that we can help scientists and experts make significant progress in other areas.
What was it like attending the five matches in Seoul?
I think everyone there understood that history was being made. The media covered reactions ranging from excitement to sadness, and I certainly felt those as well. But in the midst of everything I was most struck by how personal and human it felt.
For the past two years the AlphaGo team held players like Lee Se-dol as monuments in our minds. Then suddenly we were there – competing and collaborating to produce beautiful games of Go, with one of the most brilliant players of modern history. It was such a fantastic honour.
Nina Haikara is a writer with the department of computer science in the Faculty of Arts & Science at the University of Toronto.