Today, a prominent child safety organization, Thorn, in partnership with a leading cloud-based AI solutions provider, Hive, announced the release of an AI model designed to flag unknown CSAM at upload. It’s the earliest AI technology striving to expose unreported CSAM at scale.

  • Kyrgizion@lemmy.world
    144 up, 2 down · 1 month ago

    Not a single peep about false positives.

    I’m sure it won’t be abused though. And if anyone does complain, just get their electronics seized and checked, because they must be hiding something!

    • oldfart@lemm.ee
      90 up, 1 down · 1 month ago

      Reminds me of the A-cup breasts porn ban in Australia a few years ago, because only pedos would watch that.

      • baldingpudenda@lemmy.world
        50 up · 1 month ago

        There was a porn studio that was prosecuted for creating CSAM. Brazil, I believe. Prosecutors claimed that the petite, A-cup woman was clearly underage. Their star witness was a doctor who testified that such underdeveloped breasts and hips clearly meant she was still going through puberty and couldn’t possibly be 18 or older. The porn star showed up to testify that she was in fact over 18 when they shot the film and brought all her identification, including her birth certificate and passport. She also said something to the effect of women come in all shapes and sizes and a doctor should know better.

        I can’t find an article. All I’m getting is GOP Trump pedo nominees and Brazil laws on porn.

        • Scratch@sh.itjust.works
          49 up · 1 month ago

          Not to mention the self image impact such things would have on women with smaller breasts, who (as I understand it) generally already struggle with poor self image due to breast size.

          • Clinicallydepressedpoochie@lemmy.world
            24 up · 1 month ago

            If this is the price I must pay, I will pay it, sir! No man should be deprived of privately viewing a consenting adult’s perfectly formed small tits. They can take my liberty, they can take my livelihood, but they will never take away my boner for puffy nipples on a small-chested half-Japanese woman!

      • JackbyDev@programming.dev
        17 up · 1 month ago

        This sort of rhetoric really bothers me. Especially when you consider that there are real adult women with disorders that make them appear prepubescent. Whether that’s appropriate for pornography is a different conversation, but the idea that anyone interested in them is a pedophile is really disgusting. That is a real, human, adult woman and some people say anyone who wants to love them is a monster. Just imagine someone telling you that anyone who wants to love you is a monster and that they’re actually protecting you.

      • AmidFuror@fedia.io
        4 up · 1 month ago

        Australia has a more general ban on selling or exhibiting hard porn, but it is legal to possess it. So it’s not just small boobs.

    • JackbyDev@programming.dev
      16 up · 1 month ago

      It could also, of course, make mistakes, but Kevin Guo, Hive’s CEO, told Ars that extensive testing was conducted to reduce false positives or negatives substantially. While he wouldn’t share stats, he said that platforms would not be interested in a tool where “99 out of a hundred things the tool is flagging aren’t correct.”

      I take this to mean it is at least 1% accurate lol.
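
The "99 out of a hundred" remark is about precision: the share of flagged items that are actually bad. A minimal sketch of how precision depends on how rare the target content is, using made-up numbers (Hive has not published prevalence, true-positive, or false-positive rates, so every value below is a hypothetical assumption):

```python
def precision(prevalence: float, tpr: float, fpr: float) -> float:
    """Fraction of flagged uploads that are true positives.

    prevalence: share of all uploads that are actually bad (assumed)
    tpr: true-positive rate of the classifier (assumed)
    fpr: false-positive rate of the classifier (assumed)
    """
    true_pos = prevalence * tpr
    false_pos = (1.0 - prevalence) * fpr
    return true_pos / (true_pos + false_pos)


# Hypothetical: 1 in 1,000 uploads is bad, the classifier catches 99% of
# those, and wrongly flags 1% of everything else.
example = precision(prevalence=0.001, tpr=0.99, fpr=0.01)
print(f"precision: {example:.1%}")
```

With these assumed rates, most flags are false positives even though the classifier sounds accurate, which is why the unpublished stats matter.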

    • Erasmus@lemmy.world
      80 up, 8 down · 1 month ago

      Just remember folks. Kutcher is a slimeball too.

      The guy went from a D list star and hanging out with the likes of Danny Masterson and going to Diddy’s infamous parties - to suddenly overnight courting the US government and being the face of ‘helping’ children everywhere.

      Yeah right……

      • chonglibloodsport@lemmy.world
        30 up, 5 down · 1 month ago

        I’d be wary of calling him guilty by association. Maybe when he realized who he was really hanging out with he was so horrified and disgusted that he just had to get involved and do something to fight back?

        • Erasmus@lemmy.world
          19 up, 10 down · 1 month ago

          It’s awfully coincidental that he seems to hang out with the ‘rapist’ crowd. Even going as far as writing a letter for Masterson about how nice a guy he is, to try to get him a lenient sentence.

          Even Hollywood has ostracized him and his wife - news sites recently reported they were looking to leave the country and let things cool off for a while.

          I’m sure everyone is right though that keep posting here, that he is a swell guy who was just in the wrong place at the wrong time, multiple times. Several years worth of multiple times with wrong people. Just a coincidence.

          • BassTurd@lemmy.world
            23 up, 7 down · 1 month ago

            The difference between us giving him a benefit of the doubt and claiming innocence and your take, is that you are labeling him a pedophile without proof. That’s a significant claim if false, and imo takes an assumption too far. Maybe he’s bad and it should be looked into, but saying he did something because he was on a show with and good friends with a guy that happened to be a rapist is wrong.

        • Phoenixz@lemmy.ca
          1 up · 30 days ago

          Nah, it’s much easier to chastise people for not knowing what nobody knew

      • Nine@lemmy.world
        29 up, 4 down · 1 month ago

        People can grow and change. Not saying he did or didn’t. Just saying that people aren’t a monolith. It’s plausible he just grew and his views changed / evolved.

        That being said, it’s highly convenient where he’s positioned himself these days…

      • NeuronautML@lemmy.ml
        2 up · edited · 30 days ago

        Wasn’t he also featured in a video about how he couldn’t wait until Hilary Duff and the Olsen twins turned 18 because he wanted to date them when they were like 15?

  • db0@lemmy.dbzer0.com
    49 up · edited · 1 month ago

    It’s the earliest AI technology striving to expose unreported CSAM at scale.

    horde-safety has been out for a year now. Just saying… It’s not a trained AI model in this way, but it’s still using Neural Networks (i.e. “AI Technology”)

  • hendrik@palaver.p3x.de
    26 up, 2 down · 1 month ago

    And will we get that technology to keep the Fediverse and free platforms safe? Probably not. All the predecessors have been kept away for sole use of the big players, despite populism always claiming we need to introduce total surveillance to keep the children safe…

    • Riskable@programming.dev
      16 up, 2 down · 1 month ago

      I was going to say… Sure would be nice to have this feature in all the open source AI image generator tools but you’re absolutely right 😩

      • hendrik@palaver.p3x.de
        11 up · 1 month ago

        Yeah, unless someone publishes even a set of hashes of known bad content for the general public… I kind of doubt the true intentions are preventing CSAM to the benefit of everyone.

    • db0@lemmy.dbzer0.com
      4 up · 1 month ago

      IFTAS is already working with Thorn towards this goal. But you already have access to such technology through my toolset.

      • hendrik@palaver.p3x.de
        2 up · edited · 1 month ago

        This one? I loosely followed your work… Maybe I should try it someday. See how it does on a regular VPS. Thanks for the link to the IFTAS. Seems they have curated some useful links… I’ll have a look at their articles. Hope they get somewhere with that. At this point, I don’t think there is any blocklist accessible to the average Fediverse admin?!

        Edit: Thx, saw your other comment with the link to horde-safety.

        • db0@lemmy.dbzer0.com
          2 up · 1 month ago

          Ye, a normal VPS would be too slow for production use, as a GPU is recommended. But you can plug in any home PC to do it without risks

          • hendrik@palaver.p3x.de
            1 up · edited · 1 month ago

            Do you think this approach would be worth a try for the threaded Fediverse (aka Lemmy)? I mean your use-case is very different. We have some rudimentary image detection to flag other kinds of unwanted images in Piefed. I could experiment with something like https://github.com/monatis/clip.cpp. Have it go through the media cache and see if it can do something useful for us. But I don’t think it’d be worth all the effort unless the whole approach is somewhat accurate and runs in real time on average VPSes.

            • db0@lemmy.dbzer0.com
              3 up · edited · 1 month ago

              This approach was developed precisely for the threaded fediverse. The initial use-case was protecting my own Lemmy from CSAM! Check out fedi-safety and pictrs-safety.

    • BetaDoggo_@lemmy.world
      3 up, 2 down · edited · 1 month ago

      If everyone has access to the model it becomes much easier to find obfuscation methods and validate them. It becomes an uphill battle. It’s unfortunate but it’s an inherent limitation of most safeguards.

      • hendrik@palaver.p3x.de
        2 up · 1 month ago

        You’re probably right. I’m not sure if it’s a good idea to walk close to the edge with things like this, though. Every update to the detection model could change things and get them in jail… So I certainly wouldn’t play a cat and mouse game with something that has several years of jail time attached… But then I don’t really know the thought process of the average pedo. And AI image detection comes with problems anyway. In the article they say it detected 6 million pictures already, while keeping quiet about the rate of false positives. We know people have gotten in serious trouble for (false) claims. And I also wouldn’t want to be the Fediverse admin who has to go through thousands of flagged pictures and look at them and decide which is which. With consequences attached… Maybe a database of hashes would be the only option. That doesn’t detect new pictures, but at the same time it comes without false positives and you can’t draw conclusions from hash values.
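
The hash-database idea mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the blocklist contents and the `is_blocked` helper are hypothetical, and exact SHA-256 matching is swapped in for simplicity (it is not a description of how Thorn, Hive, or any real hash list works). An exact cryptographic hash only matches byte-identical files, so a re-encoded or resized copy would not match.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the uploaded bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_blocked(data: bytes, blocklist: set[str]) -> bool:
    """Exact-match lookup: True only if these exact bytes are on the list."""
    return sha256_hex(data) in blocklist


# Hypothetical blocklist built from placeholder bytes for demonstration.
# A real list would be distributed as digests only, never as files.
blocklist = {sha256_hex(b"placeholder-known-bad-file")}

print(is_blocked(b"placeholder-known-bad-file", blocklist))   # same bytes
print(is_blocked(b"placeholder-known-bad-file!", blocklist))  # one byte added
```

The admin only ever handles digests, which matches the point that nothing can be reconstructed from the hash values themselves.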

  • floofloof@lemmy.ca
    19 up, 4 down · 1 month ago

    This seems like a potential actual good use of AI. Can’t have been much fun to train it though.

    And is there any risk of people turning these kinds of models around and using them to generate images?

    • Jimbabwe@lemmy.world
      35 up, 11 down · 1 month ago

      If AI was reliable, maybe. MAYBE. But guess what? It turns out that “advanced autocomplete” does a shitty job of most things, and I bet false positives will be numerous.

      • pearsaltchocolatebar@discuss.online
        6 up, 1 down · 1 month ago

        It’s possible to have a good AI system, but it takes millions of dollars and several thousand manhours to do, and most companies won’t put in the effort.

        But, there should always be a human in the loop.

      • AwesomeLowlander@sh.itjust.works
        4 up, 3 down · 1 month ago

        “detect new or previously unreported CSAM and child sexual exploitation behavior (CSE), generating a risk score to make human decisions easier and faster.”

        False positives don’t matter if they stick to the stated intended purpose of making it easier to detect CSAM manually.

        • Voroxpete@sh.itjust.works
          11 up · edited · 1 month ago

          The problem is that they won’t.

          Yes, AI tools, in the hands of skilled people, can be very helpful.

          But “AI” in capitalism doesn’t mean “more effective workers”, it means “fewer workers.” The issue isn’t technological so much as cultural. You fundamentally cannot convince an MBA not to try to automate away jobs.

          (It’s not even a money thing; it’s about getting rid of all those pesky “workers rights” that workers like to bring with us)

          • AwesomeLowlander@sh.itjust.works
            2 up · 1 month ago

            Here’s the thing. This technology is unequivocally one of the things AI would be very useful for. It can potentially do a lot of good. Yes, MBAs could screw it up like they screw anything else up in society. That doesn’t mean we shouldn’t be happy that we’ve created this new tech.

    • FaceDeer@fedia.io
      14 up · 1 month ago

      And is there any risk of people turning these kinds of models around and using them to generate images?

      There isn’t really much fundamental difference between an image detector and an image generator. The way image generators like stable diffusion work is essentially by generating a starting image that’s nothing but random static and telling the generator “find the cat that’s hidden in this noise.”

      It’ll probably take a bit of work to rig this child porn detector up to generate images, but I could definitely imagine it happening. It’s going to make an already complicated philosophical debate even more complicated.

    • mspencer712@programming.dev
      9 up, 1 down · 1 month ago

      I think image generators in general work by iteratively changing random noise and checking it with a classifier, until the resulting image has a stronger and stronger finding of “cat” or “best quality” or “realistic”.

      If this classifier provides fine grained descriptive attributes, that’s a nightmare. If it just detects yes or no, that’s probably fine.

    • catloaf@lemm.ee
      7 up · 1 month ago

      Nobody would have been looking directly at the source data. The FBI or whoever provides the dataset to approved groups, but after that you just say “use all the images in this folder” and it goes. But I don’t even know if they actually provide real full-resolution images, or just perceptual hashes, or downsampled images.

      And while it’s possible to use the dataset to generate new images assuming the training data had full-res images, like I said, I know they investigate the people making the request before allowing access. And access is probably supervised and audited.

    • Hoimo@ani.social
      3 up · 1 month ago

      Available image generators are already capable of generating those images and they weren’t even trained on it. Once a neural network can detect/generate two separate concepts, it can detect/generate the overlap. It won’t be as fine-tuned obviously, but can still turn out scarily accurate.

  • sexual_tomato@lemmy.dbzer0.com
    19 up, 5 down · edited · 1 month ago

    Jesus Christ. If someone ever got their hands on this model they could use it to generate new material. The grossest possible AI model to date

    • Todd Bonzalez@lemm.ee
      30 up, 3 down · 1 month ago

      No. This is an inference model, not a generative model. You generally cannot train a model for both, unless you do it on purpose, and they certainly did not (especially since inference models are way easier to train than generative models).

      • sexual_tomato@lemmy.dbzer0.com
        8 up, 2 down · edited · 1 month ago

        A generative model uses the classifier as part of its training. If you generate a picture of pure random noise, then iteratively pick random noise that the classifier says “looks” more like csam, then you can effectively generate images that the classifier says it’s 100% certain is csam. Whether or not that looks anything like what a human would consider to be csam depends on other factors but it remains a possibility.

        • Todd Bonzalez@lemm.ee
          10 up · 30 days ago

          You are describing the way DeepDream works, not the way modern diffusion models work. It’s the difference between psychedelic dog faces and a highly adherent generative image of a German Shepherd.

          I can’t imagine you’re going to get anything out of this model that actually looks like CSAM, unless there’s some sort of breakthrough in using these models for previously unrealized generative purposes.

    • Kbobabob@lemmy.world
      2 up, 1 down · 1 month ago

      I thought being able to do that was already a thing. This is designed to do the opposite.

      I know, I know… bad actors and such.

      • NauticalNoodle@lemmy.ml
        1 up · edited · 1 month ago

        …but if simple possession defines who a bad actor is…

        The irony of this never ceases to amaze me.

  • Nurse_Robot@lemmy.world
    11 up, 2 down · 1 month ago

    This is a great development, albeit with a lot of soul crushing development behind it I assume. People who have to look at CSAM or whatever the acronym is have a miserable job, so I’m very supportive of trying to automate that away from people.

    • atomicorange@lemmy.world
      2 up, 1 down · edited · 1 month ago

      Yeah, I’m happy for AI to take this particular horrifying job from us. Chances are it will be overtuned (too strict), but if there’s a reasonable appeals process I could see it saving a lot of people the trauma of having to regularly view the worst humanity has to offer without major drawbacks.

  • Churbleyimyam@lemm.ee
    11 up, 9 down · 1 month ago

    I think all CSAM should be destroyed out of respect for the victims, not proliferated. I don’t care who is hanging onto this material or for what purpose.

    • Ghostie21@lemmy.world
      17 up, 2 down · 1 month ago

      How is this proliferating csam? Also, how do you expect them to find csam without having known images? It gives a really nice way to check based on hashes without having someone look at every picture on someone’s harddrive. With this AI it should greatly help determining new or unknown images while minimizing the number of actual people that have to see that stuff, and who get scarred from looking at such images. The only reason to be against this is if you are looking at CP and want it to be harder to find, or if you don’t understand how this technology is being used.

      • Churbleyimyam@lemm.ee
        2 up · 29 days ago

        How is this proliferating csam?

        Sharing it with people and companies that it wasn’t being shared with before.

        Also, how do you expect them to find csam without having known images?

        The same way it is now: people reporting it and undercover police accounts. People recognise it.

        without having someone look at every picture on someone’s harddrive

        If it’s going to get used as evidence in court a human will have to review and confirm it. I don’t think “Because the AI said so” is going to convince juries.

        The only reason to be against this is if you are looking at CP

        Or if it’s you or someone you love who is in the CP. Having further copies of it on further hard drives, whether it’s so someone can bake it into their AI tool or any other purpose is wrong. That’s just my view though.

        • Ghostie21@lemmy.world
          2 up · 29 days ago

          Sorry I cannot post a longer response, but I’d suggest you look up how this type of forensic software is developed and used. There are a few good documentaries on it if you look; one I remember watching was on Google’s team for this stuff.

          The images are not exactly shared in that very few people have access to them, and they treat it very much like classified information so that only select people can see them.

          These models would be developed using normal images and then trained in closed systems with the real images where the accuracy is used and not the images. No need to scar the developers who just want to work.

          Nothing about the reporting of people will change. The only difference is this will allow the FBI to have a list of suspected CP and a list of normal images from a computer, allowing them to spend a fraction of the time looking at this stuff to document it. This is very important when you have people who have literally terabytes of the stuff and probably even more normal images. In general we like to minimize the time spent looking at such stuff because it is so scarring.

          As for showing the images in court, in the US hashes are acceptable evidence, again we don’t like to scar people by showing them this stuff. Additionally after you’ve been shown the 100th picture of a baby being abused and the FBI is telling you they have 1000000 more, you’ll just take their word for it.

          Anyways, hope you have a good one

    • xionzui@sh.itjust.works
      11 up, 1 down · 1 month ago

      Uh, well this one tells you if an image looks like it or not. It doesn’t generate images

        • melroy@kbin.melroy.org
          5 up, 4 down · 1 month ago

          Correct, this kind of software is trained on CP data. So such models can be easily used to generate CP instead of recognizing it, which makes them very dangerous indeed.

          Same idea as the current models that are trained to recognize cars: these models can also be used to generate a car from noise as a starting point.

          • xionzui@sh.itjust.works
            6 up, 1 down · 1 month ago

            I’m pretty sure you can’t just run it in reverse like that. There’s a whole different training and operation methodology you have to use to support generating images rather than simple text classification.

            • JackbyDev@programming.dev
              2 up · 1 month ago

              There is a method of training where you use one system to make things and another to detect them. I forget the name of this approach, but it definitely is an approach.

    • Railcar8095@lemm.ee
      9 up, 5 down · 1 month ago

      It differs in basically being something completely different. This is a classification model; it doesn’t have generative capabilities. Even if you were to get the model and its weights, and you tried to reverse engineer an “input” that it would classify as CP, it would most likely look like pure noise to you.

      Moron

        • Railcar8095@lemm.ee
          3 up, 3 down · 1 month ago

          So you need to have a model that generates CP to begin with. Flawless reasoning there.

          Look, it’s clear you have no clue what you’re talking about. Stop demonstrating it, moron.

          • JackbyDev@programming.dev
            1 up · 30 days ago

            Alright, I found the name of what I was thinking of that sounds similar to what they’re suggesting: generative adversarial network (GAN).

            The core idea of a GAN is based on the “indirect” training through the discriminator, another neural network that can tell how “realistic” the input seems, which itself is also being updated dynamically. This means that the generator is not trained to minimize the distance to a specific image, but rather to fool the discriminator. This enables the model to learn in an unsupervised manner.

            • Railcar8095@lemm.ee
              3 up, 1 down · 30 days ago

              Applying a GAN won’t work. If used for filtering, it would result in outputs being skewed younger, but it won’t show the body of a 9 year old unless the model could do that from the beginning.

              If used to “tune” the original model, it will result in massive hallucination and aberrations that can result in false positives.

              In both cases, decent results will be rare and time consuming. Anybody with the dedication to attempt this already has pictures and can build their own model.

              Source: I’m a data scientist

          • JackbyDev@programming.dev
            1 up, 1 down · 1 month ago

            The model I use (I forget the name) popped out something pretty sus once. I wouldn’t describe it as CP, but it was definitely weird enough to really make me uncomfortable. It’s the only thing it ever made that I immediately deleted and removed from the recycling bin too lol.

            The point I’m making is that this isn’t as far fetched as you believe.

            Plus, you can merge models. Get a general purpose model that knows what children look like, a general purpose pornographic model, merge them, then start generating and selecting images based on Thorn’s classifier.

            • Railcar8095@lemm.ee
              2 up · 1 month ago

              You can’t merge a generative model and a classification model. You can run them in series to get a bunch of false positives/hallucinations, but you can’t make it generate something from the other model.

              • JackbyDev@programming.dev
                1 up · 30 days ago

                When I said a “general purpose model that knows what children look like” I didn’t mean the classification model from the article. I meant a normal, general purpose image generation model. When I said “that knows what children look like” I mean part of its training set is on children, because it’s sort of trained a little on everything. When I said “pornographic model” I mean a model trained exclusively on NSFW content (and not including any CSAM, but that may be generous depending on how much care was put into the model’s creation).