So, the New York Times has thrown down the gauntlet, suing OpenAI and Microsoft for copying millions of its articles. The Times says these tech giants used those articles to train ChatGPT and similar services, creating competition for the Times itself.
Big Fight, Big Players
This lawsuit is the latest in a series aiming to stop what’s called scraping – basically, grabbing tons of online content without paying for it – to train large language models. People like actors, writers, and journalists worry that AI will learn from their work and then churn out chatbots and info sources without giving them their fair due.
The Times is the first major news outlet taking on OpenAI and Microsoft, two big names in the AI world. Microsoft even sits on OpenAI’s board and has chucked billions into the company.
In their complaint filed Wednesday, the Times said this use of their work messes with their job of keeping subscribers informed. They claim that OpenAI and Microsoft didn’t just copy from various sources – they gave extra attention to the Times’ content. They’re calling it a way to piggyback on the Times’ hard work in journalism without asking or paying.
The Battle Begins
OpenAI spoke up, saying they respect content owners and want to work with them. They even mentioned having good talks with the Times, so they’re feeling surprised and let down by this lawsuit. Microsoft, though, didn’t give a comment about the whole shebang.
The Times wasn’t sitting quietly. They say they raised their objections months back when they found out their stuff was used to train these companies’ big language models. Since April, they’ve been trying to chat with OpenAI and Microsoft, asking for fair pay and some deal. But it looks like they couldn’t find common ground.
“Fair Use”? Not So Fast
Now, Microsoft and OpenAI claim they used the Times’ work under “fair use” rules, which allow limited use of copyrighted stuff without permission in some cases, like when it’s transformed for a different purpose. But the Times isn’t buying it. They argue that ChatGPT and Microsoft’s Bing chatbot do a similar job to what the Times does. Copying their work without payment isn’t fair game, according to them.
Pushing Back Against AI
The Times isn’t alone in this. Other newsrooms have put a stopper on OpenAI’s web crawler, which scans their sites for content.
Earlier this year, comedian Sarah Silverman and a couple of authors sued Meta and OpenAI, claiming these AI models were trained on their book materials without their say-so. No word from either company about the lawsuit, though. A judge mostly tossed out those claims in November.
The Allegations and Impact
The Times is dropping some big claims. They say the datasets used to train OpenAI’s recent big language models likely had millions of their articles. In one dataset called Common Crawl, the Times’ site is up there as the third most-used source, right after Wikipedia and US patent docs.
Taking a Stand
The Times knows AI’s the future, but they’re saying, “Hey, play fair!” Their top lawyer told staff that they dig AI’s potential but don’t want it stepping on their toes. They say the companies should be on the hook for billions in damages but haven’t put an exact number on the table yet. Plus, they’re gunning for a court order to stop OpenAI and Microsoft from carrying on like this and even want any AI models using their stuff to be destroyed.