
Comments
Why not try Google Colab, it's free?
pls stop going offtopic
I have a spare PC and am wondering if I can get this running on it:
i5 9600kf
32gb ddr4 ram 2400 MHz
240gb M2 SSD
128gb SSD
1tb HDD
RTX 2060
Would like to know as well if I can upgrade the GPU to get this running.
oh sorry, it was just that people started talking about that so I recommended some
You can probably run 8-16 billion parameter models with ease; the RAM is good and so is the VRAM. You can try ollama with a web UI, or even LM Studio.
Does not affect you in any way unless you decide to ask questions about those topics. Unlikely.
Like @cainyxues mentioned, you'll easily be able to run 7b-14b models on this.
I'm a big fan of ollama and open-webui at the moment; it took me all of 30 seconds to get it set up if you use docker-compose.
edit: just saw you have a GPU, here's a docker-compose that will utilise that if you want
Didn't think about this yet, but I will read up on it and investigate in the next weeks/months. Thanks for this tip!
With a 16GB M1 Pro, which has 200GB/s memory bandwidth, I can run the DeepSeek R1 distill 14b, but I still couldn't get a grip on the limits. Chat window size? Max accumulated tokens?
For tok/sec, memory bandwidth is crucial; that is what I gather from Twitter posts.
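A minimal sketch of why bandwidth matters, using a common rule of thumb rather than anything measured in this thread: to generate one token, roughly all model weights are read from memory once, so bandwidth divided by model size gives a rough upper bound on tok/sec. The 200 GB/s figure is from the post above; the 9 GB model size (a 14b model at roughly 4-bit quantization) and the function name are assumptions for illustration.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound on generation speed.

    Assumes every generated token requires reading all model weights
    from memory once, so memory bandwidth is the limiting factor.
    Real-world numbers will be lower.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


# 200 GB/s is the M1 Pro figure quoted above.
# 9 GB is an ASSUMED on-disk size for a 14b model at ~4-bit quantization.
print(round(max_tokens_per_second(200, 9.0), 1))
```

The same arithmetic suggests why older DDR3/DDR4 servers with far lower bandwidth only manage a few tokens per second on larger models.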
Very interesting. I have a KS-LE-E and a KS-LE-B with 64 and 32GB RAM sitting mostly idle, and would really like to try this for AI tasks that are not realtime, so tokens/s is not too important. I will spin up ollama and test a bit. Thanks for giving me the idea of actually running an LLM on those servers.
pls update with your findings
32b running on a KS-LE-B, it's just 11$/m though
3t/s maybe
Anyone tested this with the LE-B with 1245v5 w/ iGPU?
My 7950X3D with 48GB RAM and a 7900 XTX can run deepseek 32b perfectly; 70b answers questions very slowly. When I run 32b, it uses all the GPU memory (24GB) and 40GB of system memory. This is my own PC.
Anyone of you tried the 72b model on the 128GB Kimsufi?
Do you guys think it will run well on a swap file?
I assume this is deepseek-r1:32b and ran on the e3 1270?
here is 1245 v5 (32gb) for comparison
and 1245 v6 (32gb), maybe slightly faster?
Interestingly, they all started with the same joke, but at least the second one was unique.
It actually runs on 64gig without a swap file.
But fuck hell, even slower.
Maybe on bigger context sizes, it needs 60GB+
72b running on an 11$/m dedi, with 15GB to spare, is nuts.
edit: here is the joke:
Why don’t scientists trust atoms?
Because they make up everything! 😄
Just to mess around I installed Ollama on one of the KS-LE-1s that I just received that has 32GB RAM and 2x480GB SSDs.
I have installed the Deepseek R1, Deepseek V3 and Phi4 models.
They work but are extremely slow.
Yeah, but it's private; nobody knows what you ask.
Let's wait for the GAME delivery. It should handle up to 32b, and given it's DDR4 it might be faster than any regular KS.
Still, for 11-12$/m it's a steal.
So how much ram is needed for each model?
32b * 1.2 is roughly what you need in GB, at least that's what I read.
The 70b just crashed on my 64GB machine.
I'm surprised it ran in the first place.
Oh okay. So the 8B model would use about 9.6 GB RAM.
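The rule of thumb quoted above (parameter count in billions times 1.2) can be sketched as a few lines of Python. This is only the thread's heuristic, not a precise formula; actual usage depends on quantization and context size, and the function names here are made up for illustration.

```python
def estimate_ram_gb(params_billion: float, overhead: float = 1.2) -> float:
    """Thread's rough heuristic: billions of parameters * 1.2 = GB of RAM."""
    return params_billion * overhead


def likely_fits(params_billion: float, available_gb: float) -> bool:
    """True if the heuristic estimate is within the available RAM (+VRAM)."""
    return estimate_ram_gb(params_billion) <= available_gb


for size in (8, 32, 70):
    print(f"{size}b -> ~{estimate_ram_gb(size):.1f} GB")

# By this heuristic a 70b model would not fit on a 64GB machine.
print(likely_fits(70, 64))
```

By this estimate the 8B model lands at about 9.6 GB and the 32b at about 38.4 GB, while 70b comes out well above 64 GB.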
KS-LE-B with E3-1245 v6 and 32gb of ram:
deepseek-r1:32b Prompt eval: 2.79 t/s Response: 1.34 t/s Total: 1.36 t/s
KS-LE-E with E5-1650 v3 and 64gb of ram:
deepseek-r1:32b Prompt eval: 3.24 t/s Response: 1.85 t/s Total: 1.86 t/s
So performance is not fantastic, but honestly for the few bucks a month and them mostly idling anyways I see a use case where tokens/s is not super important like background jobs and such
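For anyone wanting to collect comparable numbers, here is a hedged sketch of how figures like these could be derived from Ollama's HTTP API. A non-streaming call to `/api/generate` returns token counts and durations in nanoseconds (`prompt_eval_count`, `prompt_eval_duration`, `eval_count`, `eval_duration`). How the "Total" figure above was computed is not stated in the thread, so the combined rate below is an assumption, and the sample values are made up purely to show the arithmetic.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def query_ollama(model: str, prompt: str, url: str = OLLAMA_URL) -> dict:
    """Send one non-streaming generate request and return the JSON reply."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def rate(tokens: int, duration_ns: int) -> float:
    """Tokens per second from a token count and a duration in nanoseconds."""
    return tokens / (duration_ns / 1e9) if duration_ns > 0 else 0.0


def summarize(stats: dict) -> dict:
    """Compute prompt, response and combined tok/s from an Ollama reply."""
    p_tok = stats.get("prompt_eval_count", 0)
    p_ns = stats.get("prompt_eval_duration", 0)
    r_tok = stats.get("eval_count", 0)
    r_ns = stats.get("eval_duration", 0)
    return {
        "prompt_eval": rate(p_tok, p_ns),
        "response": rate(r_tok, r_ns),
        # ASSUMPTION: "total" = all tokens / (prompt time + generation time)
        "total": rate(p_tok + r_tok, p_ns + r_ns),
    }


# HYPOTHETICAL sample reply, not a real measurement.
sample = {
    "prompt_eval_count": 28,
    "prompt_eval_duration": 10_000_000_000,
    "eval_count": 134,
    "eval_duration": 100_000_000_000,
}
print(summarize(sample))
```

With a running Ollama instance, `summarize(query_ollama("deepseek-r1:32b", "Tell me a joke"))` would give the live numbers for that box.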
Have you tried the new DeepSeek R1 Dynamic 1.58-bit that just got released? They achieved an 80% size reduction. I'm interested in how well it can perform on a low/medium-end CPU.
If it's on ollama, fine; too lazy to compile shit.
edit: seems like with some params, it compiles fine for CPU only.
I wasn't going to install all these crap nvidia dependencies.
Looks like it is on ollama, but minimum VRAM+RAM=80GB, so your low end box probably won't have enough ram to even try it CPU only.