An Asian MIT student asked AI to turn an image of her into a professional headshot. It made her white with lighter skin and blue eyes.

Rona Wang, a 24-year-old MIT student, was experimenting with the AI image creator Playground AI to create a professional LinkedIn photo.

  • Dojan@lemmy.world
    1 year ago

    You put it really succinctly.

    These models are trained on images that each depict many concepts at once. If I use a simple prompt like “monkey”, I’ll probably get monkeys in a natural setting, because during training the model was given images labelled, in effect, “monkey, tree, fruit, foliage, green” and so forth. As this pattern repeats, the model starts associating one concept with another. Because many pictures that depict a monkey also depict foliage and nature, it makes sense for the model to fill in the rest of “monkey” with foliage, green and nature rather than volcano, playful dog or ice cream, since those concepts rarely appeared alongside “monkey”.

    That is essentially the problem.
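A rough way to picture that association is plain co-occurrence counting over caption tags. This is only an analogy, not how a diffusion model actually learns, and the captions below are made-up toy data rather than anything from a real dataset:

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy captions; real training sets hold vastly more image-text pairs.
captions = [
    {"monkey", "tree", "foliage", "green"},
    {"monkey", "fruit", "tree"},
    {"monkey", "foliage", "nature"},
    {"dog", "playful", "park"},
    {"volcano", "lava", "night"},
]

# Count how often each pair of tags shows up in the same caption.
pair_counts = Counter()
for tags in captions:
    for a, b in combinations(sorted(tags), 2):
        pair_counts[(a, b)] += 1


def cooccurrence(concept, other):
    """How often two tags appear in the same caption."""
    return pair_counts[tuple(sorted((concept, other)))]


def companions(concept):
    """Tags seen alongside `concept`, most frequent first."""
    counts = Counter()
    for (a, b), n in pair_counts.items():
        if a == concept:
            counts[b] += n
        elif b == concept:
            counts[a] += n
    return counts.most_common()


print(companions("monkey"))
print(cooccurrence("monkey", "volcano"))
```

In this toy data “monkey” shows up next to “tree” and “foliage” but never next to “volcano”, which is the same imbalance described above, just at a tiny scale.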

    Here is the result for “monkey”

    And here is “monkey, volcano, playful dog, ice cream”

    The datasets are permeated with these kinds of examples. If you prompt for “nurse”, you’ll get a lot of women in blue, hospital-style clothing, inside hospitals or against nondescript backgrounds, and very few men.

    Here’s “photo of a nurse”

    The more verbose and specific you get, though, the likelier it is that you’ll get the outcome you want. Here, for example, is “male (nurse:1) (wearing white scrubs:1) with pink hair skydiving”

    This was so outlandish that, without the (wearing white scrubs:1) term, it just put the skydiver in a random pink outfit. Even with the added weight on wearing white scrubs, it tends to put the subject in something pink. Without the extra weight on (nurse:1), it gave me generic pink (mostly) white men.
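For anyone unfamiliar with the (text:weight) notation in that prompt, here is a minimal sketch of how such a prompt could be split into weighted chunks. It assumes the parenthesised syntax used by some Stable Diffusion front ends; `parse_prompt` is a hypothetical helper written for this example, not the parser of any particular tool, and it ignores nesting and escapes:

```python
import re

# Matches "(some text:1.2)". The syntax is an assumption, see the note above.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_prompt(prompt, default_weight=1.0):
    """Split a prompt into (text, weight) chunks, in order of appearance."""
    chunks = []
    pos = 0
    for match in WEIGHTED.finditer(prompt):
        # Unweighted text before this weighted term gets the default weight.
        plain = prompt[pos:match.start()].strip()
        if plain:
            chunks.append((plain, default_weight))
        chunks.append((match.group(1).strip(), float(match.group(2))))
        pos = match.end()
    tail = prompt[pos:].strip()
    if tail:
        chunks.append((tail, default_weight))
    return chunks


print(parse_prompt("male (nurse:1) (wearing white scrubs:1) with pink hair skydiving"))
```

Each chunk ends up with its own number, which a front end can then use to scale how strongly that part of the prompt steers the image.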

    If we were to fix the biases present in our society, we’d possibly see fewer biases in the datasets, and subsequently less bias in the final models. The issue, I believe, isn’t so much a technological one as a societal one. The machines are only racist because we are teaching them to be, by example.

    • rebelsimile@sh.itjust.works
      1 year ago

      More to the point, there are so many parameters that can be tweaked. Throwing your image into a “generator” without knowing what controls you have, what the prompt is doing, what model it’s using and so on is like saying “the internet” is toxic because you saw a webpage that had a bad word on it somewhere.

      I put her actual photo into SD 1.5 (the same model you used) with 30 steps of the Euler sampler, a CFG scale of 5.6 and a denoising strength of 0.7, and got these back. I’d say 3/4 of them are Asian (and the model had a 70% chance to influence that away if it were truly “biased” in the way the article implies). Obviously none of them look like the original lady, because that’s not how it works. You could generate a literally infinite number of similar-looking women who won’t look like this lady with this technique.
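To put a number on what the 0.7 setting does: in image-to-image mode the strength decides how much of the sampling schedule is actually applied to the input photo. A minimal sketch, assuming the rule the Hugging Face diffusers img2img pipeline uses as far as I understand it (other front ends may round differently); `steps_actually_run` is a hypothetical name for this example:

```python
def steps_actually_run(num_steps: int, strength: float) -> int:
    """Number of denoising steps applied to the input image.

    Assumes the photo is noised up to `strength` of the schedule and only
    that fraction of the steps is then run to denoise it again.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_steps * strength), num_steps)


for strength in (0.3, 0.7, 1.0):
    print(strength, steps_actually_run(30, strength))
```

Under that assumption, 30 steps at 0.7 means 21 steps are spent reworking the photo, while a strength of 1.0 ignores the original almost entirely. That is why lower values stay close to the input and higher values drift away from it.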

      The issue isn’t so much that the models are biased. That is both tautologically obvious and, as mentioned previously, probably preferable to randomly choosing anything that isn’t specified. For instance, your monkey prompt didn’t ask for a forest, so should it have generated a pool hall or a plateau just to fill something in? The amount of specificity anyone would need would be far too high; people might be generated without two eyes because nobody asked for them.

      The issue is that the models don’t reflect the biases of all users. It’s not so much that the model made bad choices as that it made choices the user wouldn’t have made. When the user thinks “person”, she thinks “Asian person” because this user lives in Asia and that’s where her biases toward “person” start, so a model biased toward people from the Indian subcontinent doesn’t match her biases. On top of that, a generic “all people” model may simply be impossible, given that all people are likely to interpret its biases differently.

      • Dojan@lemmy.world
        1 year ago

        With a much lower denoising value I was able to get basically her, but airbrushed. The image does need a higher denoising value to get any sort of “creativity”, though, so at least with the tools and “skill” I have, a fair bit of manual editing is needed to get a “professional LinkedIn” photo.