“CSAM generated by AI is still CSAM,” DOJ says after rare arrest

jeffw@lemmy.world · 10 months ago

solrize@lemmy.world · edit-2 10 months ago

And don’t understand how generative AI combines existing concepts to synthesize images - it doesn’t have the ability to create novel concepts.

Imagine someone asks you to shoop up some pr0n showing Donald Duck and Darth Vader. You’ve probably never seen that combination in your “training set” (past experience) but it doesn’t exactly take creating novel concepts to fulfill the request. It’s just combining existing ones. Web search on “how stable diffusion works” finds some promising-looking articles. I read one a while back and found it understandable. Stable Diffusion was one of the first openly released programs of this kind, and the newer ones are mostly bigger and fancier versions of the same idea.

Of course idk what the big models out there are actually trained on (basically everything they can get, probably not checked too carefully) but just because some combination can be generated in the output doesn’t mean it must have existed in the input. You can test that yourself easily enough, by giving weird and random enough queries.

xmunk@sh.itjust.works · 10 months ago

No, you’re quite right that the combination didn’t need to exist in the input for an output to be generated - this shit is so interesting because you can throw stuff like “A medieval castle but with Iranian architecture with a samurai standing on the ramparts” at it and get something neat out. I’ve leveraged AI image generation for visual D&D references and it’s excellent at combining comprehended concepts… but it can’t innovate a new thing - it excels at mixing things but it isn’t creative or novel. So I don’t disagree with anything you’ve said - but I’d reaffirm that it currently can make CSAM because it’s trained on CSAM and, in my opinion, it would be unable to generate CSAM (at least to the quality level that would decrease demand for CSAM among pedos) without having CSAM in the training set.

solrize@lemmy.world · 10 months ago

it currently can make CSAM because it’s trained on CSAM

That is a non sequitur. I don’t see any reason to believe such a cause and effect relationship. The claim is at least falsifiable in principle though. Remove whatever CSAM found its way into the training set, re-run the training to make a new model, and put the same queries in again. I think you are saying that the new model should be incapable of producing CSAM images, but I’m extremely skeptical, as your medieval castle example shows. If you’re now saying the quality of the images might be subtly different, that’s the no true Scotsman fallacy and I’m not impressed. Synthetic images in general look impressive but not exactly real. So I have no idea how realistic the stuff this person was arrested for was.