They also didn’t seed.
Supposedly, Meta tried to conceal the seeding by not using Facebook servers while downloading the dataset to “avoid” the “risk” of anyone “tracing back the seeder/downloader” from Facebook servers, an internal message from Meta researcher Frank Zhang said, while describing the work as in “stealth mode.” Meta also allegedly modified settings “so that the smallest amount of seeding possible could occur,” a Meta executive in charge of project management, Michael Clark, said in a deposition.
SWIM has a folder of 9GB of books and it’s a lot. This is almost ten thousand times that much.
chuck is very prolific
“Pounded in the butt by the AI graduated from Facebook’s pirate training operation, but not very well compared to the lean efficiency of the pounding provided by DeepSeek with significantly less illegal torrenting, despite the eyepatch and parrot.”
one of my personal favorites from chuck’s january 2025 oeuvre