Google researchers use game-playing AI to make reinforcement learning more efficient

By David Pac. Updated 23 May 2019

Reinforcement learning, a subfield of machine learning, covers AI training techniques that use 'rewards' to steer a software policy toward specific goals. In other words, the AI tries different actions, learns from the feedback whether each one leads to better results, and then reinforces the actions that worked, automatically revising its behavior over many iterations until it gets the best results. In recent years, reinforcement learning has been used to model the impact of social rules, to create extremely strong game-playing AI, and to program robots that can recover on their own after unpleasant incidents.

Although it is highly flexible and can be applied to many different models and purposes, reinforcement learning has an unfortunate shortcoming: it is not very efficient. Training an AI model with reinforcement learning requires a large number of interactions in a simulated or real-world environment, far more than a person needs to learn a given task. To address this problem, particularly in video games, artificial intelligence researchers at Google recently proposed a new algorithm called Simulated Policy Learning (SimPLe for short), which uses learned models of video games to train good policies for selecting actions.

The researchers describe the algorithm in a newly published preprint paper titled 'Model-Based Reinforcement Learning for Atari', and in accompanying documentation published alongside the open source code.

'At a high level, the idea behind SimPLe is to alternate between learning a model of how the game behaves and using that model to optimize a policy (with model-free reinforcement learning) within the simulated game environment. The basic principles behind this algorithm are well established and have been used in a number of recent model-based reinforcement learning methods,' said Łukasz Kaiser and Dumitru Erhan, scientists on the Google AI team.
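The alternation described in the quote can be illustrated with a minimal Python sketch. This is not the researchers' implementation: the six-state chain environment, the lookup-table "world model", the Q-learning-style update, and every name below (`real_step`, `collect`, `fit_model`, `improve_policy`, `alternate`) are assumptions chosen only to keep the example small and runnable. SimPLe itself learns a neural network model from video frames.

```python
import random
from collections import defaultdict

N_STATES = 6        # toy chain of states 0..5; reaching state 5 pays a reward
ACTIONS = (-1, 1)   # step left or step right

def real_step(state, action):
    """Stand-in for the real game: a deterministic chain with a goal on the right."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def collect(policy, n_steps, rng, epsilon=0.5):
    """Play the real environment with the current policy plus random exploration."""
    data, state = [], 0
    for _ in range(n_steps):
        action = rng.choice(ACTIONS) if rng.random() < epsilon else policy[state]
        nxt, reward = real_step(state, action)
        data.append((state, action, nxt, reward))
        state = 0 if nxt == N_STATES - 1 else nxt  # restart once the goal is reached
    return data

def fit_model(data):
    """Learn a world model: here, just a lookup table of observed outcomes."""
    return {(s, a): (nxt, r) for s, a, nxt, r in data}

def improve_policy(model, q, sweeps=20, gamma=0.9):
    """Q-learning-style updates using only transitions replayed from the model."""
    for _ in range(sweeps):
        for (s, a), (nxt, r) in model.items():
            # Full-step update; safe here because the toy model is deterministic.
            q[(s, a)] = r + gamma * max(q[(nxt, b)] for b in ACTIONS)
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES)]

def alternate(n_rounds=3, steps_per_round=200, seed=0):
    """Alternate between (1) real data + model fitting and (2) training in the model."""
    rng = random.Random(seed)
    policy = [-1] * N_STATES          # start with a poor policy: always go left
    q, data = defaultdict(float), []
    for _ in range(n_rounds):
        data += collect(policy, steps_per_round, rng)
        model = fit_model(data)
        policy = improve_policy(model, q)
    return policy, model
```

Calling `alternate()` returns the final policy and the learned model. The point of the design is that `improve_policy` never touches `real_step`: all policy updates use the model, so the number of real interactions is limited to what `collect` gathers.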

As the two researchers explain, training an AI system to play games requires predicting the next frame of the target game from a sequence of observed frames and commands (e.g. 'left', 'right', 'forward', 'backward'). They also point out that a successful model can produce 'trajectories' that can be used to train the policy of a game-playing agent, which reduces the need to rely on computationally expensive sequences from the game itself.

SimPLe does exactly this. It takes four frames as input and predicts the next frame along with the reward. Once fully trained, it generates 'rollouts' (sample sequences of actions, observations, and outcomes) that are used to improve the policy. Kaiser and Erhan note that SimPLe uses only medium-length rollouts to minimize prediction errors.
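As a rough sketch of how such rollouts might be generated, the Python snippet below keeps a sliding window of the last four frames, asks a model for the next frame and reward, and stops after a fixed horizon. The `rollout` function, the `toy_model` (frames are plain integers here, not images), the `idle_policy`, and the `horizon` parameter are all hypothetical names for illustration; only the four-frame input and the idea of bounded-length rollouts come from the article.

```python
from collections import deque

CONTEXT = 4  # number of past frames fed to the model, as described in the article

def rollout(model, policy, start_frames, horizon):
    """Simulate `horizon` steps; return a list of (action, frame, reward) tuples."""
    if len(start_frames) != CONTEXT:
        raise ValueError(f"expected {CONTEXT} starting frames")
    frames = deque(start_frames, maxlen=CONTEXT)
    trajectory = []
    for _ in range(horizon):
        action = policy(tuple(frames))
        next_frame, reward = model(tuple(frames), action)
        trajectory.append((action, next_frame, reward))
        frames.append(next_frame)  # the oldest frame drops out automatically
    return trajectory

def toy_model(frames, action):
    """Hypothetical model: a 'frame' is one integer position moving at constant speed."""
    velocity = frames[-1] - frames[-2]
    next_frame = frames[-1] + velocity + action
    return next_frame, (1.0 if next_frame == 10 else 0.0)

def idle_policy(frames):
    """Hypothetical policy that never pushes the object."""
    return 0

short = rollout(toy_model, idle_policy, (0, 1, 2, 3), horizon=5)
```

Because each predicted frame is fed back in as input, any prediction error compounds with every step. Capping `horizon` is one simple way to limit that drift, which is consistent with the researchers' remark about keeping rollouts to a medium length.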

In experiments lasting the equivalent of two hours of gameplay (100,000 interactions), agents with SimPLe-tuned policies achieved the maximum score in two of the test games (Pong and Freeway), and produced near-perfect predictions up to 50 steps into the future.
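The link between 100,000 interactions and roughly two hours of play can be checked with some quick arithmetic. The article does not give the conversion, so the two constants below are assumptions: that the game runs at 60 frames per second and that the agent acts once every 4 frames, a common setup for Atari agents.

```python
INTERACTIONS = 100_000  # from the article
FRAME_SKIP = 4          # assumption: the agent acts once every 4 game frames
FPS = 60                # assumption: the game renders 60 frames per second

def play_time_hours(interactions, frame_skip=FRAME_SKIP, fps=FPS):
    """Convert a number of agent interactions into hours of real-time play."""
    seconds = interactions * frame_skip / fps
    return seconds / 3600

hours = play_time_hours(INTERACTIONS)  # about 1.85 hours under these assumptions
```

Under these assumptions the result is a little under two hours, which matches the figure quoted above.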

SimPLe sometimes struggles to capture small but highly relevant details in games, which leads to failures. Kaiser and Erhan concede that the algorithm does not yet match the performance of standard reinforcement learning methods. However, SimPLe can be about twice as efficient in training, and the team hopes future research will improve the algorithm's performance significantly.

'The main promise of model-based reinforcement learning methods is in environments where interactions are costly, slow, or require human labeling, as in many robotics tasks. In such environments, a learned simulator would allow a better understanding of the agent's environment and could lead to new, better, and faster ways of doing multi-task reinforcement learning.'
