DeepMind's AI model can learn how to create videos just by watching clips on YouTube

DeepMind has announced a new invention that promises to bring many changes to video creation and post-production editing in the future.

You may have heard of FaceApp, a mobile photo editing app that has attracted great attention around the world for using artificial intelligence (AI) to edit selfies with an extremely high level of realism. Or This Person Does Not Exist, a website that generates convincing portraits of entirely fictional, computer-generated people. These are just two of the many applications in which AI takes part in photo editing and creative tasks. So what about video editing?

Recently, DeepMind, an Alphabet subsidiary that works primarily on artificial intelligence, published new research titled 'Efficient Video Generation on Complex Datasets', which promises to bring many changes to video creation and post-production editing. At its core is an AI algorithm that learns to create simple clips from the videos it sees during training.

Picture 1: DVD-GAN can now create sample videos with full object composition.

Researchers at DeepMind said their best-performing model to date, Dual Video Discriminator GAN (DVD-GAN), can create videos at a resolution of 256 x 256 pixels, with commendable fidelity and up to 48 frames in length.

'Creating high-definition, natural-looking video is a major challenge for today's AI models. The most significant obstacles are the complexity of the data and the computational requirements. For this reason, much prior work on video generation has centered on relatively simple data sets, or on tasks where strong temporal information is already available. We focus on the tasks of video synthesis and video prediction, and aim to extend the strong results of today's leading image-generation models to the much more complex video domain,' said a representative of the team.

The team built their system around a state-of-the-art AI architecture and introduced a number of video-specific tweaks that made it possible to train on Kinetics-600, a data set of 'natural' videos that is much larger than those typically used. Specifically, the researchers took advantage of generative adversarial networks (GANs).

Picture 2: A set of 4-second synthesized video clips, generated by a model trained on 12 128 × 128 frames from Kinetics-600.

For those unfamiliar, a GAN is an AI system made up of two separate parts. The first is the generator network, which creates synthetic (fake) samples with the goal of making them as realistic as possible. The second is the discriminator network, whose job is to distinguish real data from fake data. GANs have been used in many specialized tasks, such as converting captions into scene-by-scene stories and, in particular, generating artificial photos with extremely high realism.
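The tug-of-war between the two networks can be illustrated with a minimal sketch of the classic GAN loss functions. This is the textbook binary cross-entropy formulation, shown only to make the idea concrete; it is an assumption for illustration and not necessarily the exact objective DVD-GAN is trained with. The function names and the sample probabilities are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy loss for the discriminator (illustrative only).

    d_real: discriminator outputs, as probabilities in (0, 1), on real samples.
    d_fake: discriminator outputs on generated (fake) samples.
    The loss is low when real samples score near 1 and fakes score near 0.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: low when fakes are scored as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# A discriminator that separates real from fake well has a lower loss
# than one that outputs 0.5 for everything.
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])

# The generator is rewarded when the discriminator is fooled.
fooling = generator_loss([0.9])
caught = generator_loss([0.1])
```

During training the two losses are minimized in alternation, so each network's progress makes the other's task harder.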

DVD-GAN contains two discriminator networks. A spatial discriminator critiques the content and structure of single frames by randomly sampling full-resolution frames and processing them individually, while a temporal discriminator provides the learning signal needed to generate movement. A separate module, called a Transformer, allows learned information to propagate across the entire AI model.
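As a rough sketch of how a video might be split between the two discriminators, the snippet below samples a few full-resolution frames for the spatial side and shrinks every frame for the temporal side. The article only states that frames are sampled randomly at full resolution; the choice of k = 8 frames, the 2 x 2 average pooling, and the helper names `sample_frames` and `downsample_2x` are assumptions made for illustration.

```python
import random


def sample_frames(video, k, rng=random):
    """Pick k distinct full-resolution frames, kept in temporal order."""
    if k > len(video):
        raise ValueError("k cannot exceed the number of frames")
    indices = sorted(rng.sample(range(len(video)), k))
    return [video[i] for i in indices]


def downsample_2x(frame):
    """Average-pool a 2D frame (a list of rows) over non-overlapping 2x2 blocks."""
    height, width = len(frame), len(frame[0])
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


# Toy video: 48 frames of 4x4 pixels, where every pixel of frame t has value t.
video = [[[float(t)] * 4 for _ in range(4)] for t in range(48)]

# Spatial discriminator input: a handful of frames at full resolution (assumed k=8).
spatial_input = sample_frames(video, k=8, rng=random.Random(0))

# Temporal discriminator input: all frames, at reduced resolution (assumed 2x2 pooling).
temporal_input = [downsample_2x(frame) for frame in video]
```

Splitting the work this way means neither discriminator has to process every pixel of every frame, which is one plausible reason such a design scales to longer, larger videos.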

The Kinetics-600 training data set is a huge collection compiled from more than 500,000 high-resolution YouTube clips, each no longer than 10 seconds. The videos were originally curated for human action recognition, and the researchers describe the data set as 'diverse' and 'unconstrained', which makes it particularly suitable for training open-ended models such as DeepMind's DVD-GAN, for which overfitting is a concern. (In machine learning, 'overfitting' refers to models that fit a specific data set too closely and, as a result, fail to predict future observations reliably.)

According to the research team, after training on Google's third-generation Tensor Processing Units for between 12 and 96 hours, DVD-GAN was able to create videos with full object composition, movement, and even complex textures such as reflections on a river's surface or an ice rink. DVD-GAN still struggled to create complex objects at higher resolutions, where motion involves a much larger number of pixels. However, the researchers noted that when evaluated on UCF-101 (a smaller data set of 13,320 videos of human actions), the samples generated by DVD-GAN achieved an Inception Score of 32.97, which is not bad at all.
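For readers unfamiliar with the metric, the Inception Score is commonly defined as the exponential of the average KL divergence between a classifier's per-sample class predictions and the overall class distribution across all samples. The sketch below implements that standard definition on toy inputs; the function name and the example probabilities are hypothetical, and it does not reproduce the researchers' actual evaluation pipeline.

```python
import math


def inception_score(probs):
    """Inception Score from per-sample class probabilities (illustrative only).

    probs: list of probability lists, one per generated sample, each summing to 1.
    Returns exp(mean over samples of KL(p(y|x) || p(y))).
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all samples.
    marginal = [sum(p[c] for p in probs) / n for c in range(num_classes)]
    total_kl = 0.0
    for p in probs:
        total_kl += sum(
            p[c] * math.log(p[c] / marginal[c])
            for c in range(num_classes)
            if p[c] > 0
        )
    return math.exp(total_kl / n)


# Confident predictions spread evenly over 4 classes give the maximum score, 4.
best = inception_score([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

# Completely uninformative predictions give the minimum score, 1.
worst = inception_score([[0.25, 0.25, 0.25, 0.25]] * 3)
```

Under this definition the score ranges from 1 up to the number of classes, so higher values indicate samples that are both recognizable and varied.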

Picture 3: A video sample created by DVD-GAN, which reaches an Inception Score of 32.97.

'In the future, we want to further emphasize the benefits of training generative models on large and complex video data sets such as Kinetics-600. Although much remains to be done before realistic videos can be consistently generated in an unconstrained setting, we believe DVD-GAN is a stepping stone toward realizing that goal,' said a representative of the research team.

David Pac

Update 28 July 2019
