• stephen01king
    8 months ago

    How often does it spit out copyrighted material? Is there data on that?

    • buffaloseven@fedia.io
      8 months ago

      More and more research is being done on it, and the figures I’ve seen range anywhere from 20% to 60% of responses containing copyrighted material. Here’s a recent study where they explicitly try to coerce LLMs into breaking copyright: https://www.patronus.ai/blog/introducing-copyright-catcher

      I don’t have the time to grab them right now, but many of the lawsuits brought against companies developing LLMs include, in their opening filings, some statistics on how frequently the models infringed by returning copyrighted material.