OpenAI could be in a ‘clear violation’ of YouTube’s terms of service, CEO says—depending on how it trains its new Sora video tool

It’s well known that OpenAI scrapes vast amounts of data, some of it copyrighted, from the internet to produce the uncannily human-like experience of ChatGPT. The legality of that is still a live question, as lawsuits from The New York Times and others attest to. But how does it train its new video AI program, Sora?

If Sora used content from YouTube, it would be a “clear violation” of the platform’s terms of service, YouTube CEO Neal Mohan told Bloomberg.

Mohan was referring to longstanding questions about where AI companies get the content they use to train the models that power their services. While Mohan was careful to say he didn’t know whether OpenAI had used YouTube content to develop Sora, he said it would be a problem if true.

“From a creator’s perspective, when a creator uploads their hard work to our platform, they have certain expectations,” Mohan said. “One of those expectations is that the terms of service are going to be abided by.” 

Something like having their content scraped from the platform and used by a third party would be a “clear violation of our [terms of service],” Mohan said. 

Downloading videos or transcripts would violate those terms. “Those are the rules of the road in terms of content on our platform,” Mohan said.

A spokesperson for YouTube confirmed its terms of service “prohibit unauthorized scraping or downloading of YouTube content,” without elaborating on Mohan’s comments. OpenAI did not immediately respond to a request for comment. 

OpenAI admitted that it had used copyrighted data to train its AI models, saying it was “impossible” to build the technology without it. The admission came from a filing OpenAI submitted to the British House of Lords when the U.K. government was considering a new law that would limit how AI companies could use copyrighted material. 

More recently, the launch of Sora drew further scrutiny when OpenAI CTO Mira Murati was unable to answer a question about what type of content had been used to train the program, and specifically whether any of it had come from YouTube. “I’m actually not sure about that,” Murati told the Wall Street Journal.

Murati then added that any data used was publicly available or licensed. Mohan alluded to that interview, telling Bloomberg it should ask OpenAI whether it had used YouTube data. “I guess they were asked,” Mohan said, seeming to remember mid-sentence and cutting himself off.

Further complicating the matter is that YouTube and Google’s parent company, Alphabet, is developing its own suite of AI tools, making it likely that Alphabet is even more concerned a potential rival might be using its content in a way that violates its terms of service. 

“Google wants that data for its own models,” Igor Jablokov, founder and CEO of AI startup Pyron, told Fortune.

The AI arms race has already kicked off a gold rush for data. Big AI players like Alphabet, Microsoft, Amazon, and Meta will want to make sure rivals don’t take the data they’ve accumulated. “They’ll all put up walled gardens as terms and conditions,” says Jablokov, whose previous voice recognition startup was instrumental in Amazon’s subsequent creation of Alexa. 

For example, Reddit recently entered into a $60 million-a-year licensing agreement with Google that would see its content used to train the latter’s AI tools. Media companies have also struck similar deals with AI developers. The Associated Press has a deal with OpenAI that allows its archives to be used for training purposes. German media company Axel Springer, which owns Business Insider and Politico, has a similar deal that also provides attribution in answers given by ChatGPT.
