What is litellm?
litellm is a proxy server you can add between your LLM app and the LLM service.

Why use litellm?
litellm can be used to configure access, load balancing, accounting, and many other things you might be interested in once you're getting into the LLM business.

Installation
Install the litellm proxy as described here: […]
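As a rough sketch of what sitting litellm between your app and a backend looks like, a minimal proxy config might resemble the following (the model alias, backend model, and api_base are illustrative placeholders, not taken from the post):

```yaml
# config.yaml -- minimal litellm proxy sketch (names are placeholders)
model_list:
  - model_name: my-local-model          # alias your app requests
    litellm_params:
      model: ollama/llama3              # backend provider/model
      api_base: http://localhost:11434  # where the backend listens
```

The proxy would then be started with something like `litellm --config config.yaml`, and your app points its OpenAI-style client at the proxy instead of the backend directly.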
Tag: Artificial Intelligence
While ollama does handle multiple different LLMs quite nicely (loading/unloading on demand), there are situations where you may want to run multiple instances of the same model at the same time (e.g. to increase throughput). Here's how you can do so with minimal changes to your zero-effort ollama installation. Let's assume you just did […]
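One way to get a second instance, assuming a default ollama install already listening on its standard port 11434, is to start another `ollama serve` bound to a different address via the `OLLAMA_HOST` environment variable (the port 11435 below is an arbitrary free port, not from the post):

```shell
# second ollama instance on an alternate port (11435 is arbitrary)
OLLAMA_HOST=127.0.0.1:11435 ollama serve
```

Clients that should use this instance would then target `http://127.0.0.1:11435` instead of the default endpoint.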
