It should be good at conversation and creative writing, since it'll be for worldbuilding

Best if uncensored, as I'd rather have that than censorship kicking in when I least want it

I’m fine with those roleplaying models as long as they can actually give me ideas and talk to me logically

  • a2part2 · 2 days ago

    Can’t you just increase context length at the cost of paging and slowdown?

    • Smokeydope@lemmy.world · 2 days ago

      At some point you’ll run out of VRAM on the GPU. You can make room for more context by offloading some model layers to system RAM, but that slows things down.
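
      For concreteness, here’s a minimal sketch of those two knobs, assuming llama-cpp-python and a hypothetical local GGUF file (the path and numbers are placeholders, not recommendations):

      ```python
      from llama_cpp import Llama

      llm = Llama(
          model_path="model.gguf",  # placeholder: your quantized model file
          n_ctx=8192,               # bigger context -> bigger KV cache in VRAM
          n_gpu_layers=30,          # layers kept on the GPU; lower this to free VRAM
      )                             # layers not on the GPU run from system RAM (slower)
      ```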

      • a2part2 · 2 days ago

        Yes, but if he’s world building, a larger, slower model might just be an acceptable compromise.

        I was getting OOM errors doing speech-to-text on my 4070 Ti. I know (now) that I should have gone for the 3090 Ti. Such is life.

    • Pyro@programming.dev · 2 days ago

      At a certain point, layers will be pushed to RAM, leading to incredibly slow inference. You don’t want to wait hours for the model to generate a single response.
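
      A back-of-the-envelope sketch of why that happens: token generation is memory-bandwidth-bound, since every new token has to read all the weights once. Assuming round placeholder numbers for a 7B model and typical bandwidths:

      ```python
      weights_gb = 7.0   # ~7B params at 8-bit (assumption)
      vram_bw = 500.0    # GB/s, mid-range GPU VRAM (assumption)
      ram_bw = 50.0      # GB/s, dual-channel system RAM (assumption)

      print(f"GPU ceiling: ~{vram_bw / weights_gb:.0f} tok/s")  # ~71 tok/s
      print(f"RAM ceiling: ~{ram_bw / weights_gb:.0f} tok/s")   # ~7 tok/s
      # Offloaded layers sit behind the RAM number, so a partly offloaded
      # model ends up much closer to the slow ceiling than the fast one.
      ```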