Did that AI system use Doja Cat records for training data?

  • onlinepersona@programming.dev
    1 month ago

    It’s dead easy. Yet GitHub didn’t do it when training Copilot and is now being sued because of it.

    It is also easy to build a database of copyrighted material and check whether revealed training data matches it. The copyright licence doesn’t necessarily need to be attached. It just makes it easier to spot.
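
    To make the "database and check" idea concrete, here is a rough sketch, not how any real system does it. It fingerprints overlapping word windows of each known work and looks up the same fingerprints in a leaked sample. The function names, the window size `k=8`, and the example titles are all made up for illustration.

```python
import hashlib


def fingerprints(text: str, k: int = 8) -> set:
    """Hash every run of k consecutive words (case and whitespace normalized)."""
    words = text.lower().split()
    windows = (words[i:i + k] for i in range(len(words) - k + 1))
    return {hashlib.sha256(" ".join(w).encode("utf-8")).hexdigest() for w in windows}


def build_index(works: dict, k: int = 8) -> dict:
    """Map each fingerprint to the titles of the works that contain it."""
    index = {}
    for title, text in works.items():
        for fp in fingerprints(text, k):
            index.setdefault(fp, set()).add(title)
    return index


def find_matches(sample: str, index: dict, k: int = 8) -> dict:
    """Count how many k-word windows of the sample appear in each indexed work."""
    hits = {}
    for fp in fingerprints(sample, k):
        for title in index.get(fp, ()):
            hits[title] = hits.get(title, 0) + 1
    return hits


# Hypothetical data: one "copyrighted" work and a sample that quotes part of it.
works = {"song_a": "the quick brown fox jumps over the lazy dog near the quiet river bank today"}
index = build_index(works)
sample = "model output: the quick brown fox jumps over the lazy dog near a fence"
matches = find_matches(sample, index)
```

    Exact-window hashing only catches verbatim copying. Paraphrased or reformatted material would need fuzzier matching, which is a harder problem than this sketch suggests.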

    Also, what are you arguing here? That because copyright is easy to ignore, it should be ignored, or that it’s pointless? Is that the advice you’d give anybody else too? “You know what, Disney, everyone ignores copyright, so why not make everything public domain?”

    Anti Commercial-AI license