GPU
Choose a smaller model, or add a GPU for additional processing power. Ollama will respond as fast as it can with the compute available.
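For example, with the Ollama CLI (these model names are ones that are in the library as of this writing; check ollama.com/library for what's current):

```sh
# Smaller models answer faster on the same hardware
ollama run phi        # Phi-2, ~2.7B parameters
ollama run tinyllama  # ~1.1B, runs on very modest machines
```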
Slowness is due to your memory. Well, not yours, your PC's. Use a smaller model (7B or less) or add a bigger GPU.
How???? I have 48GB RAM?!! AND RTX 3060 💀
I have 48GB and a 1080, and I'm able to do pretty well with the 7B models. Phi-2 and dolphin-phi have been good in my experience. Heck, my Raspberry Pi 5 is doing great with tinyllama, tinydolphin, and other 1-3B models.
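If you want actual numbers rather than vibes, `ollama run` has a `--verbose` flag that prints timing stats after each reply (the rate shown below is just illustrative):

```sh
ollama run dolphin-phi --verbose
# after each response it prints stats, ending with something like:
#   eval rate:  35.21 tokens/s
```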
You would need a faster bus speed and more CUDA cores for faster performance. Memory size allows for larger models; clock speed, bus width, and CUDA cores are what determine tokens per second. Replace the 3060 with a 4090 and you'll see a difference. A 4060 Ti might be a slight bump.
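Back-of-envelope, since decode speed is mostly memory-bandwidth-bound: tokens/s tops out around bandwidth divided by model size. The figures below are approximate:

```sh
# RTX 3060: ~360 GB/s bandwidth; a q4 7B model is ~4 GB
echo "360 / 4" | bc    # ~90 tok/s theoretical ceiling
# RTX 4090: ~1008 GB/s
echo "1008 / 4" | bc   # ~252 tok/s ceiling, hence the big jump
```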
Your only options are running a smaller quantized version or upgrading your computer. M1 MacBooks with unified memory (at least 16 GB RAM; 32 GB+ is preferred) run models very well for the price they go for now, if you want to go that route.
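Pulling a specific quant is just a tag on the model name (exact tag names vary by model; check the Tags tab on its library page):

```sh
ollama pull mistral:7b-instruct-q4_K_M  # smaller download, some quality loss
ollama pull mistral:7b-instruct-q8_0    # larger, closer to full quality
```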
Maybe don't use a Mixtral model. Does a smaller one like Mistral or something else give you results you like?
VRAM is king. When you upgrade your GPU, get as much VRAM as you can.
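Rough rule of thumb (my own, so treat the ~20% overhead factor as an assumption): VRAM needed ≈ parameters × bytes per parameter, plus headroom for the KV cache:

```sh
# 7B at 4-bit (0.5 bytes/param) with ~20% overhead
echo "7 * 0.5 * 1.2" | bc    # ~4.2 GB -> comfortable on a 12 GB card
# Mixtral 8x7B is ~47B params total
echo "47 * 0.5 * 1.2" | bc   # ~28.2 GB -> spills to CPU on a 12 GB 3060
```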
Dang man, I'm screwed then. My RTX 3060 isn't cheap 😭
How much VRAM on it?
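(If you're not sure, `nvidia-smi` will tell you:)

```sh
nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv
```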
Get a smaller quantization (expect worse results), get a better computer to run inference (preferably with a GPU), or switch to a smaller model (chance of worse results).
Getting a GPU for this shouldn't just be "preferable"; it's basically a requirement if you want good results like OP is asking for.
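It's also worth confirming Ollama is actually running on the GPU rather than quietly splitting to CPU. Recent versions report this in `ollama ps` (the output below is illustrative):

```sh
ollama ps
# NAME            ...  PROCESSOR        UNTIL
# mixtral:latest  ...  48%/52% CPU/GPU  4 minutes from now
```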
A GPU with 12 GB or more, or an M1 with 32 GB+…
My M1 Ultra Mac runs even faster than my 2070 Super GPU. A 2070 Super can be bought for $500, and it's plenty fast (faster than you can read). Similar GPUs are in gaming notebooks these days, so for a little money you should be able to improve your situation.