During practice, players first face simple one-player games, such as finding a purple cube or placing a yellow ball on red ground. They move on to more complex multiplayer games like Hide and Seek or Capture the Flag, where teams compete against each other to be the first to find and grab their opponent’s flag. The playing field manager does not have a specific objective but aims to improve the general abilities of the players over time.
Why is this cool? AIs like DeepMind’s AlphaZero have beaten the world’s best human players at chess and Go. But they can only learn one game at a time. As DeepMind co-founder Shane Legg put it when I spoke to him last year, it’s like having to swap out your chess brain for your Go brain every time you want to switch games.
Researchers are now trying to build AIs that can learn multiple tasks at once, which means teaching them general skills that make it easier to adapt to new challenges.
An exciting trend in this direction is open-ended learning, where AIs are trained on many different tasks without a specific goal. In many ways, this is how humans and other animals seem to learn, via aimless play. But it requires a lot of data. XLand generates that data automatically, in the form of an endless stream of challenges. It is similar to POET, an AI training dojo where two-legged bots learn to overcome obstacles in a 2D landscape. The world of XLand is far more complex and detailed, however.
XLand is also an example of AI learning to make AI, or what Jeff Clune, who helped develop POET and leads a team working on this topic at OpenAI, calls AI-generating algorithms (AI-GAs). “This work pushes the boundaries of AI-GA,” says Clune. “It’s very exciting to watch.”