RTX 4060 (or similar)+Ryzen 7900 AIO Tower Server (Taking pre-orders) - HostCram LLC

OpaqueRegistrant · February 23

@maws said:
If you are selling these and have a US presence, that can be legally risky. Don't forget that DC (data center) use is not permitted for RTX cards.

I don't think there's any real legal risk. There's no law that stops you from sharing a GPU. Nvidia just doesn't like that you do it, and if they find out, they can stop selling you more GPUs, but that hardly matters when you only need 3 of them and you can get them from Best Buy or whatever. It would be a problem if you were trying to buy 1000 GPUs direct from Nvidia.

They can also put code in their drivers to try to stop you from doing it, but you can try to remove that code from their drivers. Technically that might be illegal under the DMCA anti-circumvention clause, but the argument would be tenuous. Has anyone ever been sued over something like that for a driver? If open-source reverse-engineered drivers can exist, so can that.

I'm always interested in unusual offers, though I don't know what I'd do with a 1/24 slice of a GPU.

Shakib · February 23

root@mars:~# dmidecode --type 17 | grep -i "Speed"
Speed: 4800 MT/s
Configured Memory Speed: 5600 MT/s
Speed: 4800 MT/s
Configured Memory Speed: 5600 MT/s
Speed: 4800 MT/s
Configured Memory Speed: 5600 MT/s
Speed: 4800 MT/s
Configured Memory Speed: 5600 MT/s

Stable.
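
The output above lists four modules, each reporting a lower "Speed" than "Configured Memory Speed". As a rough sketch (not something posted in the thread), the pairs can be extracted and compared like this. The function name `parse_speeds` is hypothetical, and the sample string simply repeats the values shown above. The thread does not say how the higher configured speed was set.

```python
import re

# Matches both "Speed: 4800 MT/s" and "Configured Memory Speed: 5600 MT/s".
SPEED_LINE = re.compile(r"\s*(Configured Memory Speed|Speed):\s*(\d+)\s*MT/s")

def parse_speeds(dmidecode_text):
    """Pair each module's 'Speed' with its 'Configured Memory Speed' (MT/s)."""
    reported, configured = [], []
    for line in dmidecode_text.splitlines():
        match = SPEED_LINE.match(line)
        if not match:
            continue
        label, value = match.group(1), int(match.group(2))
        if label.startswith("Configured"):
            configured.append(value)
        else:
            reported.append(value)
    return list(zip(reported, configured))

# Sample input copied from the dmidecode output in this thread (four modules).
sample = "Speed: 4800 MT/s\nConfigured Memory Speed: 5600 MT/s\n" * 4

pairs = parse_speeds(sample)
# Modules whose configured speed is higher than the speed they report.
above_reported = [cfg > rep for rep, cfg in pairs]
```

Here `pairs` holds one `(reported, configured)` tuple per module, so a mixed configuration would show up as differing tuples.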

Shakib · February 23

root@mars:~# lspci -nnk | grep -A3 "Battlemage"
03:00.0 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:00.1 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:00.2 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:00.3 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:00.4 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:00.5 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:00.6 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:00.7 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:01.0 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:01.1 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:01.2 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:01.3 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
03:01.4 VGA compatible controller [0300]: Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]
Subsystem: Intel Corporation Device [8086:1114]
Kernel driver in use: xe
Kernel modules: xe
root@mars:~#

We can split the GPU and assign it to 12 VMs.
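
The listing above shows 13 PCI functions for 12 VMs. A minimal sketch of how one might count them, assuming (the thread does not state this) that the lowest address, 03:00.0, is the physical function kept by the host and the other 12 are the virtual functions handed to VMs. The helper names are hypothetical, and the sample text is rebuilt from the addresses in the output above.

```python
import re

# PCI address at the start of an lspci device line, e.g. "03:00.1".
ADDRESS = re.compile(r"^([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])\s", re.IGNORECASE)

def list_functions(lspci_text, needle="Battlemage"):
    """Return the PCI addresses of device lines that mention `needle`."""
    addresses = []
    for line in lspci_text.splitlines():
        match = ADDRESS.match(line.strip())
        if match and needle in line:
            addresses.append("{}:{}.{}".format(*match.groups()))
    return addresses

def split_pf_and_vfs(addresses):
    """Assumption: the lowest address is the physical function, the rest are VFs."""
    ordered = sorted(addresses)
    return ordered[0], ordered[1:]

# Rebuild the device lines from the thread: 03:00.0-03:00.7 and 03:01.0-03:01.4.
slots = [(0, fn) for fn in range(8)] + [(1, fn) for fn in range(5)]
sample = "\n".join(
    f"03:{dev:02x}.{fn} VGA compatible controller [0300]: "
    "Intel Corporation Battlemage G21 [Intel Graphics] [8086:e212]"
    for dev, fn in slots
)

pf, vfs = split_pf_and_vfs(list_functions(sample))
```

With the addresses from this thread, `pf` is `03:00.0` and `vfs` contains the remaining 12 functions, which matches the 12 VMs mentioned.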

Nanja · February 23

You mentioned that the GPU will be split and each customer will receive a slice.

I have a few questions about how that works technically:

How is the VRAM allocated when the GPU is sliced? Is the memory statically partitioned per user, or dynamically shared?
If I want to run AI models for media generation (which typically require loading dense models entirely into VRAM), will my allocated slice have dedicated VRAM, or will it be shared with others?
Since large AI models must fully reside in VRAM to function properly, how does that work under GPU slicing?

Does each user get isolated VRAM space?

Or could memory contention impact performance or cause failures?

Are there any limitations on running AI inference workloads under this setup?

Shakib · February 24

@Nanja said:
You mentioned that the GPU will be split and each customer will receive a slice.

I have a few questions about how that works technically:

How is the VRAM allocated when the GPU is sliced? Is the memory statically partitioned per user, or dynamically shared?

If I want to run AI models for media generation (which typically require loading dense models entirely into VRAM), will my allocated slice have dedicated VRAM, or will it be shared with others?

Since large AI models must fully reside in VRAM to function properly, how does that work under GPU slicing?

Does each user get isolated VRAM space?

Or could memory contention impact performance or cause failures?

Are there any limitations on running AI inference workloads under this setup?

It works way better than I expected. I didn't have to bypass anything.

Intel Graphics Panel shows 1 GB of dedicated video memory, Windows reports 1.5 GB, and I think the real figure is around 1.3 GB per VM.

YOU CAN ALSO USE RAM AS VMEM.

Will do further testing. We need to modify the Windows templates and set up a graphics driver auto-install script.

Shakib · February 24

1320 MB Dedicated Video Memory.

Google Earth is smooth as F**!

Shakib · February 24

@Nanja said:
You mentioned that the GPU will be split and each customer will receive a slice.

I have a few questions about how that works technically:

How is the VRAM allocated when the GPU is sliced? Is the memory statically partitioned per user, or dynamically shared?

If I want to run AI models for media generation (which typically require loading dense models entirely into VRAM), will my allocated slice have dedicated VRAM, or will it be shared with others?

Since large AI models must fully reside in VRAM to function properly, how does that work under GPU slicing?

Does each user get isolated VRAM space?

Or could memory contention impact performance or cause failures?

Are there any limitations on running AI inference workloads under this setup?

To answer your question directly: while your memory is strictly partitioned, your 3D and video processing are time-sliced.

Here is how the hardware handles the split behind the scenes using Intel's GuC (Graphics microController) hardware scheduler:

1. 3D and Compute (Execution Units & AI)
Unlike your VRAM, which is hard-locked to roughly 1320 MB per slice (16 GB total divided by 12, minus a little overhead), the actual processing cores (the 16 Xe2 Cores, 128 XMX AI engines, and Ray Tracing units) are not physically divided into 12 mini-GPUs.
Instead, they are shared using a time-slicing approach.

• How it works: The GPU's hardware scheduler gives each active Virtual Function (your VM slice) exclusive access to the entire processing power of the GPU for a fraction of a millisecond. Then, it rapidly context-switches to the next VM in the queue.

• The benefit: If 11 of your VMs are idle, the 1 active VM can burst and utilize nearly 100% of the B50's 3D render and compute performance.

• The catch: If all 12 VMs are running heavy 3D loads at the exact same time, the scheduler divides the processing time equally, meaning each VM will effectively get roughly 1/12th of the performance.
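
The burst and worst-case figures above follow from a simple equal-share model. A minimal sketch, assuming the scheduler splits time evenly among busy VMs and ignoring context-switch overhead (both simplifications, and `per_vm_share` is a hypothetical name, not part of any Intel tool):

```python
def per_vm_share(active_vms, total_slices=12):
    """Idealised fraction of GPU time each busy VM gets under equal time-slicing."""
    if not 0 <= active_vms <= total_slices:
        raise ValueError("active_vms must be between 0 and total_slices")
    if active_vms == 0:
        return 0.0  # nobody is using the GPU
    return 1.0 / active_vms

burst_share = per_vm_share(1)        # one busy VM, eleven idle
contended_share = per_vm_share(12)   # all twelve VMs busy at once
```

Real shares will be somewhat lower than this model suggests once switching overhead and uneven workloads are taken into account.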

2. Video Processing (Media Engines)
The Intel Arc Pro B50 has dedicated hardware media engines (Multi-Format Codec Engines) that handle hardware encode/decode for AV1, HEVC, and H.264.

• How it works: Just like the 3D compute, the video engines are time-multiplexed across your 12 slices.

• Real-world impact: If you have multiple VMs running Plex, Jellyfin, or NVR software like Blue Iris, they all share the same physical media encoders. The hardware scheduler seamlessly handles the queueing. If one VM requests a video transcode, it uses the full speed of the media engine. If three VMs are transcoding simultaneously, the GPU rapidly switches between their workloads, dividing the encoder's total throughput (frames per second) among them.

The Bottom Line
• VRAM (1320 MB): Hard-partitioned. No VM can ever exceed its 1320 MB allocation, which is why it shows up as dedicated memory.

• 3D/Compute/Video: Time-sliced. Your VMs get maximum burst performance when neighboring slices are idle, and proportional performance when they are competing for resources.
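
For anyone sizing an AI model against a slice, the VRAM arithmetic can be sketched as follows. This assumes "16 GB" means 16 GiB and that the reported 1320 MB is in MiB. The model sizes in the usage lines are hypothetical examples, and the check covers raw weights only, not activations, caches, or runtime buffers.

```python
TOTAL_VRAM_MIB = 16 * 1024     # 16 GB card, treated as GiB (assumption)
SLICES = 12                    # number of VM slices mentioned in the thread
REPORTED_SLICE_MIB = 1320      # dedicated memory reported inside a VM

# An even split before any overhead is subtracted.
even_split_mib = TOTAL_VRAM_MIB / SLICES
# Difference between the even split and what the VM actually reports.
overhead_mib = even_split_mib - REPORTED_SLICE_MIB

def weights_fit(param_count, bytes_per_param, slice_mib=REPORTED_SLICE_MIB):
    """True if the raw model weights alone fit in one slice.

    Ignores activations, caches, and runtime buffers, so a True result
    is necessary but not sufficient for the model to actually run.
    """
    weights_mib = param_count * bytes_per_param / (1024 * 1024)
    return weights_mib <= slice_mib

# Hypothetical 1-billion-parameter model at 1 byte and 2 bytes per parameter.
fits_8bit = weights_fit(1_000_000_000, 1)
fits_16bit = weights_fit(1_000_000_000, 2)
```

Under these assumptions the even split is about 1365 MiB, so roughly 45 MiB per slice goes to overhead. Because the partition is hard, a model whose weights exceed the slice cannot borrow VRAM from idle neighbours.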

I will set up a GPU usage monitor so that no single client can abuse the GPU.

Hitori0221 · February 25

looks good, received an email regarding the service earlier today

Shakib · February 25

Dual GPU Slice also works fine.

Shakib · February 28

We are ready to take orders.

Shakib · February 28

Monster: KVM-12G + Pro B50 GPU

4x Ryzen 9 7900 Cores + Pro B50 GPU
12 GB DDR5 RAM + 1.3 GB VMEM
120 GB NVMe 5.0 Storage
40 TB BW @ 10 Gbps Port
1 Dedicated IPv4 (Free IPv6)

This plan comes with Premium Support and Cloud Portal access with automated OS Rebuild, Start, Stop, Restart, Console, etc. management options.

https://my.hostcram.com/order/main/packages/special-offers/

Shakib · February 28

Intel Arc Pro B50 (Battlemage) on Linux — Setup Guide

https://gist.github.com/Bortus-AI/c9a79371b561c716874ba2cc2bd2f3cf

Credit: Bortus-AI
