• Owl [he/him]@hexbear.net
    4 months ago

    LLMs are text prediction engines. They predict what comes after the previous text. They were trained on a large corpus of raw, unfiltered internet text, because that’s the only thing available that actually has enough data (there is no good training set), then fine-tuned on smaller samples of hand-written and curated question/answer format “as an AI assistant boyscout” text. When the previous text gets too weird for the hand-curated stuff to be relevant to its predictions, it essentially reverts to raw internet. The most likely text to come after weird, poorly written horror copypasta is more weird, poorly written horror copypasta, so that’s what it predicts. Then it’s fed its own previous output and told to predict what comes next, and it spirals into more of the same.
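
To make the feedback loop concrete, here is a rough toy sketch. It is not how a real LLM works internally: it swaps the neural network for a simple word-follows-word count table, and the two tiny corpora (`RAW`, `CURATED`) and all function names are made up for illustration. The point is only the loop: predict the next word, append it, predict again from the model's own output.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: stand-ins for "raw internet" and "curated assistant" text.
RAW = "the thing watched me and the thing watched me and the thing watched me".split()
CURATED = "how can i help you today".split()

def train(*corpora):
    """Count which word follows which (a bigram table)."""
    table = defaultdict(Counter)
    for words in corpora:
        for prev, nxt in zip(words, words[1:]):
            table[prev][nxt] += 1
    return table

def predict_next(table, prev):
    """Greedy prediction: the most frequent follower of the previous word."""
    followers = table.get(prev)
    if not followers:
        return None  # nothing ever followed this word in training
    return followers.most_common(1)[0][0]

def generate(table, prompt, steps):
    """Repeatedly predict a word and feed the growing text back in."""
    words = prompt.split()
    for _ in range(steps):
        nxt = predict_next(table, words[-1])
        if nxt is None:
            break
        words.append(nxt)  # the model's output becomes its next input
    return " ".join(words)

table = train(RAW, CURATED)
print(generate(table, "how can", 4))    # stays in the curated text
print(generate(table, "the thing", 6))  # loops through the copypasta
```

With a prompt that looks like the curated text, the toy model continues the curated text. With a prompt that looks like the copypasta, every prediction lands back in the copypasta, and since each new word is appended to the context, it never finds its way out.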