YouTube Will Let Third Parties Train AI Models on User Content

2024 has been the year of the AI explosion, with a wave of large language models (LLMs) launching and gradually becoming an essential part of many people's technological lives.

However, artificial intelligence (AI) companies are struggling to collect high-quality training data; many are "thirsty" for material to feed their large-scale models. Several major tech companies, including Apple, Nvidia, Salesforce, and Anthropic, are embroiled in a controversy over AI training data, most notably allegations that they used YouTube's vast library of video content to train AI, which raises serious questions about digital content copyright.

To address these concerns, YouTube will give creators more control over how third-party companies can use their content to train AI. The official statement from Team YouTube reads:

In the coming days, we'll be rolling out an update that will allow creators and video rights holders to choose to let third-party companies use their content to train AI models. This option will appear directly in Studio Settings under 'Third-party training'.

By enabling this feature, creators grant permission for companies such as xAI, Apple, Amazon, Anthropic, Meta, Microsoft, Nvidia, OpenAI and others to use their videos to train their respective AI models. However, not all videos are eligible. To be 'selected' as AI training data, a video must meet the following conditions:

The copyright holder of the video allows third parties to use the video to train AI.
The video privacy setting is public.
The video complies with YouTube's Terms of Service and Community Guidelines.

But it seems that many people are not happy about big tech companies using their content to train AI models. Take Bluesky users, for example. The social media platform's user community expressed outrage after a machine learning expert released a dataset containing one million posts on Bluesky.

Many users joined Bluesky to escape platforms like X (formerly Twitter), where Elon Musk's xAI used user posts to train its AI, Grok. They thought they had found a safer space, but the incident made many realize that even on Bluesky, their content could be used without their consent.

In the UK, nearly 40 creative groups, including publishers, authors and photographers, are urging the government to enforce copyright protections as they join a consultation on AI and the creative industries. The Creative Rights in AI Coalition advocates for a licensing market to enable fair use of creative content in generative AI, ensuring that content creators retain control over their work and remuneration.

In August 2024, US artists won an early victory in a closely watched AI copyright case. A district judge allowed their copyright infringement claims against Stability AI, Midjourney, DeviantArt, and Runway AI to proceed, finding that the artists had plausibly alleged the companies used their work without permission to train AI models.

Update 24 December 2024
