• The Hobbyist
    7 hours ago

    You can. I’m running a 14B DeepSeek model on mine, and it reaches 28 tokens per second (t/s).
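
For anyone comparing against the 28 t/s figure above, here is a minimal sketch of how generation speed can be computed. It assumes the numbers come from an Ollama-style response, where `eval_count` is the number of generated tokens and `eval_duration` is the generation time in nanoseconds. The thread does not say which runtime produced the 28 t/s, and the sample values below are made up for illustration, not measured.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation throughput: tokens produced divided by seconds spent generating."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    return eval_count / (eval_duration_ns / 1e9)


# Hypothetical sample (not a real measurement): 560 tokens generated in 20 s.
sample = {"eval_count": 560, "eval_duration": 20_000_000_000}

rate = tokens_per_second(sample["eval_count"], sample["eval_duration"])
print(f"{rate:.1f} t/s")
```

This counts generation only. Prompt processing is reported separately, so two people quoting "t/s" may not be measuring the same thing.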

    • levzzz@lemmy.world
      3 hours ago

      You need a pretty large context window to fit all the reasoning. Ollama uses a 2048-token context by default, and raising it uses more memory.
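
A rough sketch of why a longer context costs memory: the key/value cache grows linearly with the number of tokens kept in context. The architecture numbers below (48 layers, 8 KV heads, head dimension 128, 16-bit cache) are assumptions chosen to resemble a 14B-class model, not values taken from this thread, and the estimate ignores weights, compute buffers, and any KV-cache quantization the runtime may apply. The `num_ctx` option is how Ollama accepts a larger context in a request; the model name and prompt passed in are placeholders.

```python
def kv_cache_bytes(ctx_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """Approximate KV-cache size: one K and one V vector per layer, per KV head, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem


# ASSUMED architecture for illustration only (14B-class model, 16-bit cache).
ASSUMED = dict(n_layers=48, n_kv_heads=8, head_dim=128)


def generate_payload(model: str, prompt: str, num_ctx: int) -> dict:
    """Request body for Ollama's /api/generate with a non-default context length."""
    return {"model": model, "prompt": prompt, "options": {"num_ctx": num_ctx}}


for ctx in (2048, 8192, 16384):
    mib = kv_cache_bytes(ctx, **ASSUMED) / 2**20
    print(f"{ctx:>6} tokens -> {mib:,.0f} MiB of KV cache")
```

Under these assumed values the cache grows from 384 MiB at 2048 tokens to 3072 MiB at 16384 tokens, which is a meaningful share of a 12 GB card once the model weights are loaded.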

    • Viri4thus@feddit.org
      5 hours ago

      I also have a 3060. Can you detail which framework (SGLang, Ollama, etc.) you are using and how you got that speed? I’m having trouble reaching that level of performance. Thanks.

        • Jeena@piefed.jeena.net
          3 hours ago

          Thanks for the additional information. It helped me decide to get the 3060 12G instead of the 4060 8G. They cost almost the same, but from what I gather the 3060 12G fits my use cases better even though it is a generation older: the memory bus is wider and it has more VRAM. Both video editing and the smaller LLMs should work well enough.