The Hobbyist to LocalLLaMA@sh.itjust.works · English · 11 months ago

New Mistral model is out


From Simon Willison: “Mistral tweet a link to a 281GB magnet BitTorrent of Mixtral 8x22B—their latest openly licensed model release, significantly larger than their previous best open model Mixtral 8x7B. I’ve not seen anyone get this running yet but it’s likely to perform extremely well, given how good the original Mixtral was.”


Audalin@lemmy.world
11 months ago
I thought MoEs had to be loaded entirely into (V)RAM, and that the inference speedup comes from only needing a fraction of the experts to compute the next token. The choice of experts can be different for each token, though, so you need them all ready; otherwise you keep moving data between disk <-> RAM <-> VRAM and get reduced performance.
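To make that point concrete, here is a toy sketch of top-k routing. It is not Mixtral's actual router: the 8 experts with 2 active per token match the commonly described Mixtral setup, but the random scores, `route`, and `simulate` are made up purely for illustration.

```python
import random

NUM_EXPERTS = 8   # experts per MoE layer (assumed, Mixtral-style)
TOP_K = 2         # experts actually run per token (assumed)

def route(scores, k=TOP_K):
    """Return the indices of the k highest-scoring experts."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def simulate(num_tokens, seed=0):
    """Route random 'tokens' and record which experts each one touches."""
    rng = random.Random(seed)
    touched = set()
    per_token = []
    for _ in range(num_tokens):
        # Stand-in for a learned router: one random score per expert.
        scores = [rng.random() for _ in range(NUM_EXPERTS)]
        chosen = route(scores)
        per_token.append(chosen)
        touched.update(chosen)
    return per_token, touched

per_token, touched = simulate(num_tokens=200)
print("experts run per token:", TOP_K, "of", NUM_EXPERTS)
print("distinct experts hit across all tokens:", len(touched))
```

Each token only runs 2 of the 8 experts, which is where the compute saving comes from, but over a few hundred tokens essentially every expert gets picked at some point. That is why the full set of weights needs to be resident (or swapped in on demand, at a cost).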
