December 7, 2017 (updated 8 December 2017, 11:33am)

Google unveils ‘superhuman’ AI AlphaZero

A novel deep neural network has beaten the top chess engine with just four hours of self-play training.

By CBR Staff Writer

If the robot Olympics were not entertaining enough, how about robo chess championships? Google has created the supreme algorithm AlphaZero, which reigns undefeated against the former computer world champion.

New research published on the arXiv preprint server details how computer scientists from Google DeepMind improved their deep neural network enough to produce an AI capable of “superhuman performance in many challenging domains”: chess and two other board games, shogi and Go.

So who did AlphaZero defeat? Stockfish 8, the open-source program that was then the strongest conventional chess engine. Rather than brute-force search, AlphaZero uses its neural network to guide the search, evaluating tens of thousands of positions per second where Stockfish examines tens of millions. The network achieved its results with one minute or less of thinking time per move.

Scientists emphasised the “tabula rasa reinforcement learning from games of self-play” as the AI’s most impressive quality. Little more than four hours after the neural network was given nothing but the rules of chess, AlphaZero had played enough games against itself to outperform Stockfish; the new kid on the block won 28 of their 100 games and drew the remaining 72.
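To make the idea concrete, here is a deliberately tiny illustration of tabula rasa self-play learning. It is not DeepMind’s method (AlphaZero pairs a deep network with a guided tree search); it is a sketch built around a hypothetical one-pile Nim game and a simple value table, intended only to show the shape of a self-play training loop.

```python
# Illustrative sketch: tabula rasa self-play learning on a toy game
# (single-pile Nim). The agent knows only the rules, plays against
# itself, and updates a value table from game outcomes.
import random
from collections import defaultdict

PILE = 15          # starting number of stones
MOVES = (1, 2, 3)  # legal moves: take 1, 2 or 3 stones
EPSILON = 0.1      # exploration rate
LR = 0.05          # learning rate for value updates

# value[stones] ~ estimated chance that the player to move wins
value = defaultdict(lambda: 0.5)

def choose_move(stones):
    """Mostly greedy on current values, occasionally random."""
    legal = [m for m in MOVES if m <= stones]
    if random.random() < EPSILON:
        return random.choice(legal)
    # A good move leaves the opponent in a low-value position.
    return min(legal, key=lambda m: value[stones - m])

def self_play_game():
    """Play one game against itself; return visited states and the winner."""
    history, stones, player = [], PILE, 0
    while stones > 0:
        history.append((stones, player))
        stones -= choose_move(stones)
        player = 1 - player
    return history, 1 - player   # the player who took the last stone wins

for _ in range(20000):
    history, winner = self_play_game()
    for stones, player in history:
        target = 1.0 if player == winner else 0.0
        value[stones] += LR * (target - value[stones])

# Positions with a multiple of four stones should trend toward low values.
print({s: round(value[s], 2) for s in sorted(value) if s > 0})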

AlphaZero went on to vanquish Elmo, the leading engine for shogi (Japanese chess). After only two hours of solitary training, Google’s algorithm won 90 of their 100 games, losing eight and drawing the other two.

Figure: scalability of AlphaZero with thinking time, measured on an Elo scale. (a) Performance of AlphaZero and Stockfish in chess, plotted against thinking time per move. (b) Performance of AlphaZero and Elmo in shogi, plotted against thinking time per move. (D. Silver et al.)

To the uninitiated, chess strategy may seem a simple case of learning patterns and knowing how to identify traps. Yet the true complexity of the game makes it an excellent scenario in which to test deep learning capability. As is oft-cited, there are more possible games of chess than there are atoms in the observable universe.
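That claim can be sanity-checked with a back-of-the-envelope calculation in the spirit of Claude Shannon’s classic estimate; the branching factor and game length below are rough assumed figures, not numbers from the DeepMind paper.

```python
# Shannon-style order-of-magnitude estimate: possible chess games
# versus atoms in the observable universe. Both figures are coarse
# assumptions used only for comparison.
branching = 30      # assumed legal moves in a typical position
plies = 80          # assumed game length: roughly 40 moves per side
possible_games = branching ** plies        # roughly 10^118
atoms_in_universe = 10 ** 80               # commonly quoted estimate

print(f"possible games ~ 10^{len(str(possible_games)) - 1}")
print("atoms, roughly ~ 10^80")
print("games exceed atoms:", possible_games > atoms_in_universe)
```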


“The AlphaGo Zero program recently achieved superhuman performance in the game of Go, by tabula rasa reinforcement learning from games of self-play,” wrote authors David Silver, Thomas Hubert, Julian Schrittwieser and colleagues.

“Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play.”


Google declined to comment until the full research is published in a peer-reviewed journal.

An earlier system, AlphaGo, triumphed against three-time European Go champion Fan Hui in October 2015.

The 5-0 defeat was the first time a computer program had beaten a professional Go player, and the details were published in the scientific journal Nature. Google notes on its DeepMind blog that AlphaGo “somehow taught the world completely new knowledge” over the course of its games against Lee Sedol in Seoul, South Korea, in March 2016.

Beyond its board-game projects, DeepMind’s machine learning API SC2LE invites the developers of the world to craft algorithms that can win at Blizzard Entertainment’s StarCraft II. The original StarCraft is already used by ML researchers, who compete annually in the AIIDE bot competition.
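For a sense of what the SC2LE interface looks like to a developer, the fragment below is a minimal scripted agent written against DeepMind’s open-source pysc2 package, which wraps SC2LE. It assumes pysc2 and a local StarCraft II installation, and it simply issues a no-op every game step rather than playing intelligently.

```python
# Minimal pysc2 agent: does nothing on every game step. Requires the
# pysc2 package and a local StarCraft II installation; run with e.g.
#   python -m pysc2.bin.agent --map Simple64 --agent your_module.NoOpAgent
from pysc2.agents import base_agent
from pysc2.lib import actions


class NoOpAgent(base_agent.BaseAgent):
    """Scripted agent that issues a no-op action on every step."""

    def step(self, obs):
        super(NoOpAgent, self).step(obs)
        # obs.observation holds the feature layers and available actions;
        # a real agent would inspect them here before choosing a move.
        return actions.FunctionCall(actions.FUNCTIONS.no_op.id, [])
```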

Artificial intelligence can be trained to perform extremely well within a narrowly defined set of circumstances. The real challenge of “machine intelligence” is to create generalised knowledge that can be applied across many situations. Although AlphaZero is not capable of cooking dinner (or even unpacking a chess set), the fact that the same algorithm succeeded at three different games, each after a relatively short period of self-training, is a major step for science.
