• Aniki 🌱🌿@lemm.ee
    7 months ago

    I haven’t looked into it closely, but I’m not interested in playing the GPU game against the gamers, so if AMD can do a Tesla equivalent with gobs of RAM and no display hardware I’d be all about it.

    Right now it’s looking like I’m going to build a server with a pair of K80s off eBay for a hundred bucks, which will give me 48GB of VRAM to run models in.
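
A rough way to sanity-check the 48GB figure is to estimate how much memory a model's weights need at a given precision. The sketch below is only back-of-the-envelope arithmetic: the parameter counts, the 1.2x overhead factor for cache and activations, and the function names are illustrative assumptions, not numbers from this thread. Note also that each K80 is a dual-GPU card with 12GB per GPU, so 48GB arrives as four separate 12GB devices and a model has to be split across them.

```python
def weights_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


def fits(n_params: float, bits_per_param: int, total_gib: float,
         overhead: float = 1.2) -> bool:
    """Rough check: weights times an assumed overhead factor vs. total memory.

    The 1.2 overhead (KV cache, activations, runtime) is a guess, not a measurement.
    """
    return weights_gib(n_params, bits_per_param) * overhead <= total_gib


if __name__ == "__main__":
    total = 48.0  # aggregate memory of two K80s, treated loosely as GiB
    # Hypothetical model sizes, chosen only for illustration.
    for label, n_params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
        for bits in (16, 8, 4):
            need = weights_gib(n_params, bits)
            verdict = "fits" if fits(n_params, bits, total) else "too big"
            print(f"{label} @ {bits}-bit: {need:6.1f} GiB weights -> {verdict}")
```

This only checks the aggregate total; whether a model actually runs also depends on whether the software can split it across the individual 12GB GPUs.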

    • dublet@lemmy.world
      7 months ago

      Some of the LLMs it ships with are very reasonably sized and still impressive. I can run them on a laptop with 32GB of RAM.

      • Aniki 🌱🌿@lemm.ee
        7 months ago

        This is very interesting! Thanks for the link. I’ll dig into this when I manage to have some time.

    • tal@lemmy.today
      7 months ago

      > if AMD can do a Tesla equivalent with gobs of RAM and no display hardware I’d be all about it.

      That segment of the market is less price-sensitive than gamers, which is why Nvidia is demanding the prices that they are for it.

      An Nvidia H100 will give you 80GB of VRAM, but you’ll pay $30,000 for it.

      AMD competing more strongly with Nvidia in that sector will improve pricing, but I doubt very much that it’s going to make compute cards cheaper than GPUs.

      Besides, if you did wind up with compute cards being cheaper, you’d have gamers just rendering frames on compute cards and then using something else to push the image to the screen. I know that Linux can do that with PRIME, and I assume that Windows can as well. That’d cause their attempt to split the market by price to fail. Nah, they’re going to split things up by amount of VRAM on the card, not by whether there’s a video interface on it.

      I suspect that a better option is to figure out ways to reasonably split up models to run on lower-VRAM GPUs in parallel.
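
One simple way to picture splitting a model across lower-VRAM GPUs is to assign contiguous runs of layers to each card until its memory budget is used up, then move to the next card. The sketch below assumes that layer-wise approach, which is only one of several possible strategies and is not something the comment above specifies. The function name, the per-layer sizes, and the budgets are hypothetical.

```python
from typing import List


def partition_layers(layer_sizes_gib: List[float],
                     device_budgets_gib: List[float]) -> List[List[int]]:
    """Greedily assign consecutive layers to devices in order.

    Returns one list of layer indices per device. Raises ValueError if the
    layers do not fit within the given budgets when placed in order.
    """
    assignment: List[List[int]] = [[] for _ in device_budgets_gib]
    device = 0
    used = 0.0
    for idx, size in enumerate(layer_sizes_gib):
        # Advance to the next device whenever this layer would overflow the current one.
        while (device < len(device_budgets_gib)
               and used + size > device_budgets_gib[device]):
            device += 1
            used = 0.0
        if device == len(device_budgets_gib):
            raise ValueError(f"layer {idx} does not fit on the remaining devices")
        assignment[device].append(idx)
        used += size
    return assignment


if __name__ == "__main__":
    # Hypothetical example: 8 layers of 1 GiB each across two GPUs with 4 GiB free.
    print(partition_layers([1.0] * 8, [4.0, 4.0]))
```

With this kind of split, each GPU holds only its own layers and activations are handed from one card to the next, so the cards mostly take turns rather than all working at once. Real frameworks add more than this sketch covers, such as accounting for activations and cache.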