I recently started working on an LLM implementation on my laptop. Iteration is slow: each test run can take up to an hour. How low-end can I go on the GPU side and still get a meaningful speedup? I'm assuming Ollama uses CUDA, right? I know some tools use Vulkan for acceleration, which would open me up to using an AMD card, but I don't know whether Ollama works that way.
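
For context on how slow "slow" is, here's a minimal sketch of how I've been measuring throughput, assuming a local Ollama server on its default port and a model that's already been pulled (the model name below is just a placeholder; the `eval_count`/`eval_duration` fields come from Ollama's documented generate API):

```python
import requests  # third-party: pip install requests

# Ask the local Ollama server for a short completion and report tokens/sec.
# Assumes Ollama is running on its default port (11434); swap in whatever
# model you actually use.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3"  # placeholder model name

resp = requests.post(
    OLLAMA_URL,
    json={"model": MODEL, "prompt": "Say hello.", "stream": False},
    timeout=300,
)
resp.raise_for_status()
data = resp.json()

# eval_count / eval_duration are part of Ollama's response metadata;
# eval_duration is reported in nanoseconds.
tokens = data["eval_count"]
seconds = data["eval_duration"] / 1e9
print(f"{tokens} tokens in {seconds:.1f}s -> {tokens / seconds:.1f} tok/s")
```

Single-digit tokens/sec on my current setup, which is why I'm wondering what the cheapest GPU is that would actually move that number.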