Sylovik@lemmy.worldtoLocalLLaMA@sh.itjust.works•How much gpu do i need to run a 90b modelEnglish
4·
17 hours agoIn case of LLM’s you should look at AirLLM. I suppose there is no conviniet integrations to local chat tools, but issue at Ollama already started.
You may try Harbor. The description claims to provide an OpenAI-compatible API.