YouTube CEO to OpenAI: Don’t you dare use our videos to train Sora (even though Google trains its AI on our data)

OpenAI CTO Mira Murati didn’t have a good answer when The Wall Street Journal’s ace technology reporter Joanna Stern asked her on camera in March to identify what data was used to train the company’s text-to-video generation tool Sora. And thanks to new remarks from YouTube CEO Neal Mohan, it’s even clearer now why Murati beat around the bush instead of offering a straight answer.

“From a creator’s perspective, when a creator uploads their hard work to our platform, they have certain expectations,” Mohan said Thursday in an interview with Bloomberg’s Emily Chang. “One of those expectations is that the terms of service (are) going to be abided by. It does not allow for things like transcripts or video bits to be downloaded, and that is a clear violation of our terms of service. Those are the rules of the road in terms of content on our platform.”

In other words: OpenAI hoovering up tons of YouTube videos to teach Sora what the real world looks like is unacceptable to the search giant.

Sora is OpenAI’s AI-based text-to-video generator. Image source: OpenAI

Now, there’s nothing that makes me happier right now than seeing publishers block AI systems from using their content, or, as YouTube’s CEO has done here, go so far as to sternly warn the content bandits at OpenAI to keep their fingers out of the cookie jar. But there’s something else in Mohan’s comments that should also leap out at anyone who’s been following Google’s plan to start stealing traffic from web publishers in favor of a new AI version of Google Search.

Does anyone else find it ironic that Google, which has a monopoly chokehold on the Internet search market, is using publisher data to train its search engine and its AI, while at the same time warning OpenAI against doing the same thing with Google’s own YouTube data?

Don’t get me wrong. I’m not cheering either side here. To a certain extent, all that’s going on here is tech giants gobbling up the open web, regurgitating it back to you in a rearranged format, and then pretending that that’s innovation. Of the two companies, OpenAI is clearly the more morally bankrupt, having built its systems on the creation and genius of other people who had no idea they would be unwittingly complicit in the creation of a giant copycat machine.

Which is to say, it’s actually kind of irrelevant for YouTube to snipe at OpenAI over whether the latter has trained its systems on YouTube videos. The source of the stolen material, in other words, doesn’t make the theft any less odious. And to those of you who found yourselves blown away by this or that hyper-realistic, AI-generated Sora video, I would simply ask you: Have you ever watched a Pixar movie? Computer-generated video is not a new thing.

What is new is that OpenAI’s videos don’t require any humans.

Those Sora videos can be generated by anyone, in seconds, on the basis of a simple text prompt. The AI does all the work for you, and the results indeed are impressive. But no longer needing human input, human creativity, human work to produce these things — that’s pretty much all that’s different between Sora’s videos and a Pixar movie. One was designed by a sophisticated AI, the other by artists who imbued their work with humanity.

In the final analysis, one thing that we all have in common is that we each get only one shot at life and have a finite amount of time to spend on the things we want to do. So it really says a lot about a company and its hundreds of employees when what they choose to do with their time is push the bounds of technology as much as they can, in order to find as many ways as possible to supplant their fellow humans.
