Open Source, Linux, IT Stuff

Tag: KV cache

Artificial Intelligence Large Language Model (LLM)

LLM VRAM usage with llama.cpp

Post author By Marcel
Post date 29.04.2026

While trying to get some of Qwen’s latest models running on my AMD iGPU, I ran into some crashes. The error messages were misleading, but the cause turned out to be out-of-memory errors, so I started to think about how much memory the different components of an LLM use. The data in […]

Tags KV cache, llama.cpp, LLM, Qwen, VRAM