VPS with 16 vCPUs and 8GB DDR4/5?

I'm looking for a high-compute VPS to run a Large Language Model (AI chatbot, text classifier, etc.).

I already have a heavily quantised model running fairly well on a VPS with 8 Epyc vCPUs and 16GB of RAM, but it only uses 5.5GB of that and its bottleneck is compute, so I'm looking for a compute-optimised solution.*
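
Roughly what I'm doing today, for context: a minimal sketch using llama-cpp-python, where the model path, prompt, and numbers are placeholders rather than my real setup:

    from llama_cpp import Llama
    import time

    # Load a quantised GGUF model; n_threads should match the vCPU count.
    llm = Llama(model_path="model.Q5_K_M.gguf", n_ctx=2048, n_threads=8)

    t0 = time.time()
    out = llm("Q: What is a VPS?\nA:", max_tokens=64)
    n = out["usage"]["completion_tokens"]
    print(n / (time.time() - t0), "tokens/sec")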

Storage and bandwidth requirements are minimal, so the specs I'm looking for are something like:

Compute: 16 fast vCPUs (or maybe more?)
RAM: 8GB DDR4/5
Storage: 50GB NVMe
Bandwidth: 1TB at 100Mbps
IPv4: optional

Looking for offers at around €20 p/m, and open to quarterly payments if that sweetens the deal.

*I know I should use a GPU but this is LET 🤷‍♀️

Comments

  • sh97 Member
    edited March 29

    You can build your own config with @crunchbits

    https://crunchbits.com/vps

    Your required specs come to around $15.
    That said, you'd probably be better off with a VDS.

  • davide Member
    edited March 29

    With the latest llama.cpp and this model:

    -rw-r--r-- 1 user user  22G Sep 20  2023 WizardLM-Uncensored-SuperCOT-Storytelling.Q5_K_M.gguf
    

    I get about 1 token per second on an Intel E3-1245 v3. Cost of CPU + motherboard + memory: €100, hosted in the basement. Just saying.
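
    That's roughly what you'd expect if it's memory-bandwidth-bound rather than compute-bound; back-of-envelope, with rounded numbers of my own:

        # Each generated token streams the whole model through RAM,
        # so tokens/s is roughly memory bandwidth / model size.
        model_gb = 22        # the Q5_K_M file above
        bandwidth_gbs = 25   # ballpark for dual-channel DDR3-1600
        print(bandwidth_gbs / model_gb, "tokens/s")  # ~1.1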

    This is the piece on eBay (no aff); you can ask the seller to drop VAT and eBay fees too.

  • CloudHopper Member

    @sh97 said:
    You can build your own config with @crunchbits

    https://crunchbits.com/vps

    Your required specs come to around $15.
    That said, you'd probably be better off with a VDS.

    Oh wow, I could get 32 vCPUs and 8GB of RAM for $19 a month! 😲

    Looks to be sold out at the moment, but I'll keep a close eye on their stock because I didn't think a 4:1 vCPU-to-RAM ratio would be possible.

    A VDS would make sense if I were going to use it enough, but for now I'm only experimenting, so I'm looking to do it as cheaply as possible.

    Very much appreciate the tip about @crunchbits though 👍

  • dev_vps Member

    @CloudHopper said:
    I'm looking for a high-compute VPS to run a Large Language Model (AI chatbot, text classifier, etc.).

    Go for a Ryzen-based dedicated VDS or a physical server.

  • CloudHopper Member

    @davide said:
    With the latest llama.cpp and this model:

    -rw-r--r-- 1 user user  22G Sep 20  2023 WizardLM-Uncensored-SuperCOT-Storytelling.Q5_K_M.gguf
    

    I get about 1 token per second on an Intel E3-1245 v3. Cost of CPU + motherboard + memory: €100, hosted in the basement. Just saying.

    This is the piece on eBay (no aff); you can ask the seller to drop VAT and eBay fees too.

    Nice, that's a great example of how quantisation is making these things available to us mortals. For me, Q5 is really the sweet spot: I see a clear quality improvement over Q4, while Q6-Q8 models only give me slower inference with no obvious quality gain.
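
    (Back-of-envelope on sizes, assuming that GGUF is a 33B model and using approximate bits-per-weight figures: file size is roughly parameters × bpw / 8, which is why the Q6-Q8 jump hurts on small boxes.)

        # Rough GGUF file size: parameters * bits-per-weight / 8.
        # 33B is a guess for the model above; bpw values are approximate.
        params_b = 33
        for name, bpw in [("Q4_K_M", 4.8), ("Q5_K_M", 5.5), ("Q8_0", 8.5)]:
            print(name, round(params_b * bpw / 8, 1), "GB")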

    I've also told my model that it runs on a server with limited resources, which doesn't improve the inference speed but does make it get to the point in fewer tokens.
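
    In practice that's just a line in the system prompt, something like this (llama-cpp-python chat API; the exact wording is whatever works for you):

        from llama_cpp import Llama

        llm = Llama(model_path="model.Q5_K_M.gguf", n_threads=8)  # placeholder path

        # Nudge the model to be terse via the system prompt.
        out = llm.create_chat_completion(
            messages=[
                {"role": "system",
                 "content": "You run on a small server with limited resources. "
                            "Answer as briefly as possible."},
                {"role": "user", "content": "What is a VPS?"},
            ],
            max_tokens=48,
        )
        print(out["choices"][0]["message"]["content"])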

    I'm definitely thinking about a dedicated server for running these locally at some point, but I'll know when I've exceeded the viability of doing it on low-end boxes, and at that point a self-hosted server will be the only viable option.
