Meta allegedly torrented ~80TB of books to train AI model
From the article:
Last month, Meta admitted to torrenting a controversial large dataset known as LibGen, which includes tens of millions of pirated books. But details around the torrenting were murky until yesterday, when Meta's unredacted emails were made public for the first time. The new evidence showed that Meta torrented "at least 81.7 terabytes of data across multiple shadow libraries through the site Anna’s Archive, including at least 35.7 terabytes of data from Z-Library and LibGen," the authors' court filing said. And "Meta also previously torrented 80.6 terabytes of data from LibGen."
This is wild, even for Meta.
Comments
Can you imagine if there was a tracker that actually had Meta's full resources?
Anyone got an invite?
Did they seed though?
I smell hit&run
Did they get a DMCA-ignored VPS from LET?
What torrent client do they use? As I remember, ruTorrent/rTorrent can't handle 10k+ torrents?
Is it still "allegedly" if they admitted it?
All the big AI creators are doing something similar and making the same legal arguments. OpenAI argued "fair use" as well: https://news.ycombinator.com/item?id=37780199
They may only run afoul of the law if they knowingly distributed copyrighted content. In a way, it helps that they took measures not to seed. It would be nice for courts to start making good rulings for this new AI tech boom.
So downloading only is legal?
Piracy has become legal these days. 🙃
I smell a lawsuit.
Basically, it depends on your local laws.
If it's legal for you to make a copy from an illegal source, then yes.
But if you upload it, aka seed it, then definitely not legal.
https://annas-archive.org/
https://annas-archive.org/torrents
Where do you think these fucking documents are from... The lawsuit about this has been going on since 2023.
https://www.courtlistener.com/docket/67569326/kadrey-v-meta-platforms-inc/
The argument I see them making is that they only transfer part of a file, and you can't prove that anyone received the entire file from them, so they never enabled another copy/violation or some shit.
Pretty soon it'll be standard practice, if it isn't already.