• The Hobbyist
    4 hours ago

    You can. I’m running a 14B DeepSeek model on mine. It achieves 28 tokens/s.

    • levzzz@lemmy.world
      28 minutes ago

      You need a pretty large context window to fit all the reasoning. Ollama uses a 2048-token context by default, and raising it uses more memory.
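
      A rough sketch of both halves of that point, in Python. It assumes a local Ollama server on its default port and uses the `num_ctx` option of the `/api/generate` endpoint to raise the context window. The model tag `deepseek-r1:14b` and the layer/head numbers in the memory estimate are assumptions for illustration only; check them against the model you actually run.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, num_ctx=8192):
    """Build a /api/generate payload that overrides the context length."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # num_ctx sets the context window (in tokens) for this request.
        "options": {"num_ctx": num_ctx},
    }


def kv_cache_bytes(num_ctx, n_layers=48, n_kv_heads=8, head_dim=128,
                   bytes_per_value=2):
    """Estimate KV-cache size for a given context length.

    The defaults are illustrative assumptions (a 14B-class model with
    grouped-query attention and a 16-bit cache), not measured values.
    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * num_ctx


def generate(payload, url=OLLAMA_URL):
    """Send the payload to Ollama and return the generated text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

      Calling `generate(build_request("deepseek-r1:14b", "Why is the sky blue?", num_ctx=16384))` would request a 16k window. Under the assumed shape, the cache estimate grows linearly with `num_ctx`, which is why a bigger window eats into the VRAM left over after the weights are loaded.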

    • Viri4thus@feddit.org
      2 hours ago

      I also have a 3060. Can you detail which framework (SGLang, Ollama, etc.) you are using and how you got that speed? I’m having trouble reaching that level of performance. Thanks.