While Ollama handles multiple different LLMs quite nicely (loading and unloading them on demand), there are situations where you may want to run multiple instances of the same model at the same time, e.g. to increase throughput. Here’s how you can do so with minimal changes to your zero-effort Ollama installation. Let’s assume you just did […]
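
To make the idea concrete, here is a minimal sketch of what running two instances side by side can look like. It assumes the stock `ollama serve` binary is on your `PATH` and relies on the `OLLAMA_HOST` environment variable, which tells each server which address and port to bind; the second port (11435) is an arbitrary choice next to the default 11434. Both instances read the same default model store, so the model weights are not duplicated on disk.

```python
import os
import subprocess

# Default Ollama port is 11434; 11435 is an arbitrary second port.
PORTS = [11434, 11435]

procs = []
for port in PORTS:
    env = os.environ.copy()
    # OLLAMA_HOST controls the address/port that `ollama serve` binds to.
    env["OLLAMA_HOST"] = f"127.0.0.1:{port}"
    # Each Popen call starts an independent server process; both share
    # the default model directory (~/.ollama/models) unless OLLAMA_MODELS
    # is set differently per instance.
    procs.append(subprocess.Popen(["ollama", "serve"], env=env))

try:
    # Keep the launcher alive until both servers exit.
    for p in procs:
        p.wait()
except KeyboardInterrupt:
    # Ctrl-C: shut both instances down cleanly.
    for p in procs:
        p.terminate()
```

With both servers up, you can point one client at `127.0.0.1:11434` and another at `127.0.0.1:11435`, each serving its own copy of the same model.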
