An AI has beaten a human fighter pilot in a virtual dogfight to emerge as the Top Gun, a result also hailed as a victory for the reinforcement learning technique.

The algorithm from US defense contractor Heron Systems emerged victorious from the US military’s AlphaDogfight challenge. It defeated seven other AIs piloting simulated F-16 fighters, before taking on an experienced pilot wearing a virtual reality headset and piloting a simulator. The final score was a 5-0 win for the machine over its human counterpart.

Heron trained its system using reinforcement learning, in which the algorithm attempts a task over and over again in a virtual world, guided by rewards for good outcomes, until it gains something akin to an understanding of the task. Runner-up Lockheed Martin used similar techniques, suggesting the method could play a key role in the future development of autonomous air-to-air combat.
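For readers curious what that trial-and-error loop looks like in practice, here is a minimal sketch of tabular Q-learning, one common reinforcement learning method. It is not Heron's system: the toy "corridor" world, the function names and every parameter value are assumptions chosen for illustration. The agent starts at one end of a short corridor and is rewarded only when it reaches the other end.

```python
import random


def train_corridor(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                   epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical toy corridor.

    The agent starts in cell 0 and earns a reward of 1 for reaching the
    last cell. All names and parameter values here are illustrative.
    """
    rng = random.Random(seed)
    steps = (-1, +1)  # action 0 = move left, action 1 = move right
    q = [[0.0, 0.0] for _ in range(n_states)]  # value estimate per (cell, action)

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            # Explore occasionally; otherwise take the best-known action,
            # breaking ties at random.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in range(2) if q[state][a] == best])

            nxt = min(max(state + steps[action], 0), n_states - 1)
            done = nxt == n_states - 1
            reward = 1.0 if done else 0.0

            # Nudge the estimate towards reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])

            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for every non-terminal cell (1 means 'move right')."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]
```

Calling `greedy_policy(train_corridor())` returns the action the agent has learned to prefer in each cell. A dogfighting agent replaces the lookup table with a neural network and the corridor with a flight simulator, but the reward-driven repetition is the same idea.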

Ben Bell, Heron’s lead engineer for machine learning, said training the AI involved creating a training environment featuring a “league” of 100 unique agents.

Speaking at a Q&A after the dogfight, he said: “We didn’t try to combine any kind of expert systems. Our advantage was only using reinforcement learning so we tried to create an action scheme where the neural network could control the plane in a way that was smooth, which you didn’t see our competitors doing, and at a high enough rate where it could make some high aspect shots.

“Our training architecture is the second most important thing we did differently. We started off early with a league of agents. We want to create multiple different agents that fly in different patterns, with different reward structures and different neural network architecture. So the final agent we used has trained against 102 other totally unique agents, and that made us robust enough to beat any opponent, even the human pilot.

“Even a week before trial one we had agents that weren’t very good at flying at all. It hasn’t been easy but we’ve managed to turn it round.”
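The league idea Bell describes, training one agent against a large pool of varied opponents, can be sketched in a few lines. This is a toy illustration under stated assumptions, not Heron's architecture: the game is rock-paper-scissors rather than air combat, each league member is just a fixed set of move preferences, and the function names and parameters are hypothetical. Only the league size of 102 comes from the quote above.

```python
import random

MOVES = 3  # 0 = rock, 1 = paper, 2 = scissors


def payoff(a, b):
    """+1 if move a beats move b, -1 if it loses, 0 on a draw."""
    return [0, 1, -1][(a - b) % 3]


def build_league(size, rng):
    """Create `size` opponents, each with its own random move preferences."""
    return [[rng.random() + 0.1 for _ in range(MOVES)] for _ in range(size)]


def train_against_league(league, episodes=3000, epsilon=0.1, seed=1):
    """Epsilon-greedy learner that meets a randomly drawn opponent each round."""
    rng = random.Random(seed)
    value = [0.0] * MOVES   # running average payoff per move
    counts = [0] * MOVES
    faced = set()           # indices of league members met so far

    for _ in range(episodes):
        # Draw a fresh opponent so the learner cannot overfit to one style.
        idx = rng.randrange(len(league))
        faced.add(idx)
        opp_move = rng.choices(range(MOVES), weights=league[idx])[0]

        if rng.random() < epsilon:
            move = rng.randrange(MOVES)
        else:
            move = max(range(MOVES), key=lambda m: value[m])

        reward = payoff(move, opp_move)
        counts[move] += 1
        value[move] += (reward - value[move]) / counts[move]

    return value, faced
```

A typical call would be `league = build_league(102, random.Random(0))` followed by `train_against_league(league)`. Sampling a different opponent every round is the point of the design: the learner's estimates reflect the whole pool rather than the habits of any single rival.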

The challenge was set by the Defense Advanced Research Projects Agency (DARPA) as part of its Air Combat Evolution (ACE) programme, which “seeks to automate air-to-air combat and build human trust in AI as a step toward improved human-machine teaming.” We’re not sure if Maverick and Goose would approve.

Heron and the other competitors had a year to build their systems, and had to go through a series of qualifiers to reach the final trial.

“It’s been amazing to see how far the teams have advanced AI for autonomous dogfighting in less than a year,” said Colonel Dan “Animal” Javorsek, program manager in DARPA’s Strategic Technology Office.
