
Run AI/ML tools like Stable Diffusion on VPSes?

AI art generation tools such as Stable Diffusion seem to be the new trend these days, and I was wondering if anyone here has Stable Diffusion or something similar running on their VPS. If so, what steps did you take, what are the specifications of the VPS, and how is it working out?

I suppose you'd need a good number of GPU cores for this, and therefore a higher-end VPS, so would it be better to run it on something like Google Cloud or AWS?

Comments

  • stoned Member
    edited December 2022

    It is actually possible, using the CPU only. If the automatic1111 web UI is installed here, for example:

    ~/stable-diffusion/stable-diffusion-webui

    then edit webui-user.sh

    # force full precision (fp16 is unsupported on most CPUs) and run everything on the CPU
    export COMMANDLINE_ARGS="--precision full --no-half --use-cpu all --skip-torch-cuda-test"
    export CUDA_VISIBLE_DEVICES=-1   # hide any GPUs from PyTorch
    

    and make it use CPU only.

    You will also need 12GB of RAM at minimum, 16GB to be comfortable.
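
    If your VPS is short on RAM, adding swap can keep the process from getting OOM-killed, at the cost of speed once it starts paging. A minimal sketch (the 8G size is just an example):

    # create and enable an 8GB swap file
    sudo fallocate -l 8G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile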

    But this will cause 100% CPU usage. Unless you nice the process somehow or throttle the CPU, it will sit at 100% all the time, and that is heavy use, not fair use, so providers may not like it.
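
    For example, you could drop the scheduling priority so other workloads win contention, or hard-cap the CPU share; a rough sketch (the 50% quota is arbitrary, and systemd-run may need root or --user):

    # run at the lowest scheduling priority
    nice -n 19 ./webui.sh

    # or cap it at half of one core's worth of CPU time via a systemd scope
    systemd-run --scope -p CPUQuota=50% ./webui.sh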

    There is a lowmem option, but I've not tested that.
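
    If that lowmem option means the webui's memory-saving flags, the documented ones are --medvram and --lowvram; they mainly target GPU VRAM, so I'm not sure how much they help in CPU mode (untested):

    # append to COMMANDLINE_ARGS in webui-user.sh
    export COMMANDLINE_ARGS="$COMMANDLINE_ARGS --medvram"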

    You can go hogwild on a dedicated server with CPU-only mode though.

    I'm not sure how server CPUs compare, but using Euler a at 24-32 steps, or DPM++ 2M at 12-20 steps, on a 5800X takes roughly 2-3 minutes per image.

    I can test it out on my KS-LE-1 and see how that goes.

    Here's a console output log:

    $  ./webui.sh
    
    
    Python 3.10.8 (main, Nov  2 2022, 15:31:22) [GCC 11.3.0]
    Commit hash: 4b3c5bc24bffdf429c463a465763b3077fe55eb8
    Installing requirements for Web UI
    Launching Web UI with arguments: --precision full --no-half --use-cpu all
    Warning: caught exception 'Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx', memory monitor disabled
    
    Loading config from: /home/stoned/stable-diffusion/stable-diffusion-webui/models/Stable-diffusion/v2.yaml
    LatentDiffusion: Running in v-prediction mode
    DiffusionWrapper has 865.91 M params.
    Loading weights [2c02b20a] from /home/stoned/stable-diffusion/stable-diffusion-webui/models/Stable-diffusion/v2.ckpt
    Applying cross attention optimization (InvokeAI).
    Model loaded.
    Loaded a total of 0 textual inversion embeddings.
    Embeddings: 
    Running on local URL:  http://127.0.0.1:7860
    
    To create a public link, set `share=True` in `launch()`.
    100%|██████████| 12/12 [02:00<00:00, 10.02s/it]
    Total progress: 100%|██████████| 12/12 [02:00<00:00, 10.04s/it]
    
    
    

    On my AMD 5800X CPU each step takes roughly 10 seconds (other samplers may take longer), so 12 steps × 10 s = 120 seconds, exactly the 2 minutes shown in the log.

    Anyone care to test this on various dedicated servers?

  • Running or training? If you just want to run it, you don't need a GPU; Stable Diffusion will generate images on a CPU too. As @stoned said, 12GB of RAM is the recommended minimum to run SD (for 512x512 output resolution), but you can run it on 8GB of RAM too by reducing the output resolution, like this guy did on his RPi4 (see the sketch at the end of this comment): https://github.com/straczowski/raspberry-pi-stable-diffusion

    If you want to provide it as a service, then CPU-only is not an option because of the long wait times. If you just want to run it for fun, Google Colab's free tier is still better than any VPS under $50/m.

    I have trained SD on an AWS g4dn.xlarge, which comes with 16GB of RAM and an Nvidia T4 at $0.60/h, and I can say the T4 is not good enough. For generation it takes around 30s per 512x512 image.
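
    As a sketch of the reduced-resolution idea: if you launch the automatic1111 webui with --api added to COMMANDLINE_ARGS, you can request small images over HTTP (untested here; the prompt and sizes are just placeholders):

    # ask the webui API for a 256x256 image and decode the base64 result
    curl -s -X POST http://127.0.0.1:7860/sdapi/v1/txt2img \
      -H 'Content-Type: application/json' \
      -d '{"prompt": "a lighthouse at dusk", "steps": 12, "width": 256, "height": 256}' \
      | jq -r '.images[0]' | base64 -d > out.png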

  • lentro Member, Host Rep

    Thanks for the mention, @vyas11!

    @komdsfojn At TensorDock, we run a GPU computing marketplace. You can deploy virtual machines on dozens of servers around the world that run our hypervisor. Right now, the lowest-priced GPU available is a 3070 at $0.085/hr, but you can also grab 3090s for $0.22/hr, 4090s for $0.37/hr, and A6000s for $0.42/hr, all about 50% cheaper than other GPU clouds :)

    These are backed by an API too, so you can automatically scale up/down your deployments.

    Check it out!
    https://marketplace.tensordock.com/order_list

    Happy to give you some free credits to get started :)

    (If you're willing to trade price for flexibility, we also have our own cloud that runs on 3x-replicated storage and 10 Gbit uplinks. It's a bit more expensive for GPU compute but more reliable/secure, as it's 100% data-center-based and has storage-only billing.)

  • When I was checking out options for this, I also looked at Lambda Labs. Their lowest on-demand instance is $0.50/hr:

    https://lambdalabs.com/service/gpu-cloud/pricing
