• dinklesplein [any, he/him]@hexbear.net · 3 days ago

    source: an ML guy I know, so this could be entirely unsubstantiated, but apparently the main environmental burden of LLM infrastructure comes from training new models, not from serving inference with already-deployed models.

    • tellmeaboutit@lemmygrad.ml · 2 days ago

      That might change now that companies are building “reasoning” models like DeepSeek R1. Architecturally they aren’t all that different, but they produce much longer outputs, and inference compute scales with the number of tokens generated.
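
      A rough back-of-envelope sketch of that scaling, using the common ~6 · params · tokens (training) and ~2 · params · tokens (inference) FLOP rules of thumb for dense transformers; the model size, training-token count, and output lengths below are assumed placeholder numbers, not measurements from any real deployment:

      ```python
      # Back-of-envelope comparison of training vs. per-query inference compute.
      # All constants here are illustrative assumptions, not measured figures.

      PARAMS = 70e9         # assumed dense model size (70B parameters)
      TRAIN_TOKENS = 15e12  # assumed training corpus (15T tokens)

      def training_flops(params: float, tokens: float) -> float:
          """~6 * params * tokens: forward + backward pass over the corpus."""
          return 6 * params * tokens

      def inference_flops(params: float, tokens_generated: float) -> float:
          """~2 * params * tokens: forward passes to generate one response."""
          return 2 * params * tokens_generated

      train = training_flops(PARAMS, TRAIN_TOKENS)

      for label, out_tokens in [("chat-style answer", 500),
                                ("long reasoning trace", 10_000)]:
          per_query = inference_flops(PARAMS, out_tokens)
          queries_to_match_training = train / per_query
          print(f"{label:>22}: {per_query:.2e} FLOPs/query, "
                f"~{queries_to_match_training:.1e} queries to equal one training run")
      ```

      With these made-up numbers, a one-off training run still dwarfs any single query, but per-query inference cost grows roughly in proportion to output length, so a 20x longer reasoning trace cuts the number of queries needed to match the training bill by about 20x. That is the shift the comment above is pointing at.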