x86-64-v1 is all you need? What does cherry-picking CPU virt model actually help providers with?

Today I bought a VPS from a newly formed provider. They seem to be using the Virtualizor panel, and the CPU model template they selected is “Common KVM processor”.

From what I can see, that template only exposes something close to the x86-64-v1 feature level. In other words, it does not enable extensions that have been around for over a decade, such as AVX and AES-NI. It does not even seem to include x86-64-v2 features like SSE3 or POPCNT.

For reference, the common x86-64 feature levels are roughly:

x86-64-v1: CMOV, CX8, FPU, FXSR, MMX, OSFXSR, SCE, SSE, SSE2

x86-64-v2: CMPXCHG16B, LAHF-SAHF, POPCNT, SSE3, SSE4_1, SSE4_2, SSSE3

x86-64-v3: AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, OSXSAVE

x86-64-v4: AVX512F, AVX512BW, AVX512CD, AVX512DQ, AVX512VL
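
For anyone who wants to check which of these levels their own guest reports, here is a rough Python sketch. It assumes a Linux guest and the flag spellings used in /proc/cpuinfo as I understand them (for example SSE3 shows up as "pni", CMPXCHG16B as "cx16", LZCNT as "abm"). The names `LEVELS`, `x86_64_level` and `read_cpuinfo_flags` are made up for this example.

```python
# Flag names follow Linux /proc/cpuinfo spelling (an assumption, not part of
# the level definitions themselves): SSE3 -> "pni", CMPXCHG16B -> "cx16",
# LAHF-SAHF -> "lahf_lm", LZCNT -> "abm", OSXSAVE is approximated by "xsave".
LEVELS = [
    (1, {"lm", "cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse", "sse2"}),
    (2, {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    (3, {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    (4, {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def x86_64_level(flags):
    """Return the highest level N such that levels 1..N are all fully present."""
    present = set(flags)
    level = 0
    for number, required in LEVELS:
        if not required <= present:
            break  # a level only counts if every lower level is complete
        level = number
    return level


def read_cpuinfo_flags(path="/proc/cpuinfo"):
    """Return the flag list of the first CPU listed in /proc/cpuinfo."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return line.split(":", 1)[1].split()
    return []
```

On a Linux guest, `x86_64_level(read_cpuinfo_flags())` gives the level number, with 0 meaning that not even the v1 set was found.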

When I asked the provider about it, they said they believe anything beyond x86-64-v1 is unnecessary for most users, and that if someone wants a different CPU model, they can open a ticket and request it.

That feels a bit strange to me. If their physical CPUs are Xeon E5-v3, which support x86-64-v3 features, why limit the guest CPU instruction set by default?

I did some searching, and it seems mainstream Linux distros like Ubuntu still build packages targeting older baselines. So most software will still run fine on x86-64-v1.

But at the same time, it is pretty obvious that instruction sets like AVX can help with workloads such as databases, media encoding/decoding, and other compute-heavy tasks. AES-NI can also speed up TLS encryption/decryption. And of course, these features can make a pretty big difference in Geekbench scores too.

So I’m wondering: what is the actual benefit for the VPS provider here?

From my point of view, this seems to:

  • lower the benchmark performance of the CPU,

  • potentially increase CPU usage for the same real workload, because older instructions may need more CPU cycles to do the same job, which in turn increases steal time.

If you’re a hosting provider, how do you decide what virtual CPU model to expose to customers?

Thanked by 1: Shot2
What level do you currently provide on your VPS by default? (13 votes)
  1. x86-64-v1 (SSE, SSE2)
  2. x86-64-v2 (SSE3, SSE4, POPCNT)
      7.69%
  3. x86-64-v3 (AVX, AVX2)
      23.08%
  4. x86-64-v4 (AVX512)
      15.38%
  5. Even newer (e.g. AMX)
      53.85%

Comments

  • oloke Member, Host Rep
    edited March 15

    We (onidel) pass the host CPU through to guests. There is no point in artificially limiting the features available to our customers. I think this approach is pretty common among hosting providers nowadays.

    Restricting CPU features may reduce performance within the VM (since the guest OS cannot use the optimal instruction set) and, as a consequence, put a higher load on the hypervisor itself.

    The only real downside of this approach is the lack of live migration between hypervisors with different CPUs, since CPU features can't change without an OS reboot. However, if all hypervisors have the same CPU model, live migration is achievable.
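
To make the live-migration constraint concrete, here is a minimal Python sketch. It assumes CPU features are compared as plain sets of flag names, which is a simplification: real hypervisors check more than flags. The function names and the example flag lists are hypothetical.

```python
def can_live_migrate(guest_flags, destination_flags):
    """True if every CPU flag the guest booted with also exists on the destination host."""
    return set(guest_flags) <= set(destination_flags)


def blocking_flags(guest_flags, destination_flags):
    """Flags the running guest already sees that the destination host cannot provide."""
    return sorted(set(guest_flags) - set(destination_flags))
```

For example, a guest booted with `["sse2", "avx", "avx2"]` could not move to a host that only offers `["sse2", "avx"]`, and `blocking_flags` would name `avx2` as the reason.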

    When I asked the provider about it, they said they believe anything beyond x86-64-v1 is unnecessary for most users.

    Modern Linux distros are moving toward packages built for the newer x86-64-v3 architecture level to improve the performance of precompiled binaries:
    https://developers.redhat.com/articles/2024/01/02/exploring-x86-64-v3-red-hat-enterprise-linux-10
    https://www.omgubuntu.co.uk/2025/11/ubuntu-amd64v3-x86-64-v3-support

    So I wouldn't say it's "unnecessary for most users" in 2026.

    ps. I would also like to hear what @forest thinks on this topic.

  • tentor Member, Host Rep

    I often see "host-model" used, so what the guest gets depends on the CPU generation actually in use.

    Thanked by 1: nopint3
  • zed Member

    Are they intentionally limiting users or just don't know what they're doing?

    Thanked by 1: nopint3
  • nopint3 Member

    @zed said:
    Are they intentionally limiting users or just don't know what they're doing?

    They did this intentionally :'(

    Thanked by 2: oloke, forest
  • MannDude Patron Provider, Veteran

    @nopint3 said:

    @zed said:
    Are they intentionally limiting users or just don't know what they're doing?

    They did this intentionally :'(

    This.

    I don't think it's even a default value for Virtualizor or VirtFusion for new hypervisor deployments or anything.

    Thanked by 1: nopint3
  • yoursunny Member, IPv6 Advocate
    edited March 15

    We actively use x86-64-v2 instructions in our code.

    https://github.com/usnistgov/ndn-dpdk/blob/67f84648ea71c18aae7e6306f872a235207f5c5d/csrc/iface/reassembler.c#L50-L56

    __attribute__((nonnull)) static inline void
    Reassembler_Drop_(Reassembler* reass, LpL2* pm, hash_sig_t hash) {
      Reassembler_Delete_(reass, pm, hash);
    
      reass->nDropFragments += pm->fragCount - rte_popcount32(pm->reassBitmap);
      rte_pktmbuf_free_bulk((struct rte_mbuf**)pm->reassFrags, pm->fragCount);
    }
    

    This function is part of the reassembler in a hop-by-hop fragmentation and reassembly protocol.
    In the data structure, reassBitmap is a uint32 bitmap that indicates which fragments have arrived, and fragCount is the total number of fragments.
    If the reassembly fails due to timeout, the function is called, where the nDropFragments counter must be incremented with the number of dropped fragments.
    Instead of keeping an extra counter regarding "how many fragments have arrived", we used the POPCNT instruction.

    The underlying library also has specialized optimization for x86-64-v3 that can be applied automatically.
    However, we do not directly invoke AVX instructions ourselves.
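
As a rough illustration of why POPCNT is handy here, the Python sketch below counts bits the way software has to when no such instruction is available (the classic SWAR bit trick), then applies the same `fragCount - popcount(reassBitmap)` expression as the C code above. This is only a sketch of the idea, not the ndn-dpdk implementation; `popcount32` and `frag_count_minus_popcount` are names invented for this example.

```python
def popcount32(value):
    """Count set bits in a 32-bit value without a hardware popcount (SWAR method)."""
    x = value & 0xFFFFFFFF
    x = x - ((x >> 1) & 0x55555555)                  # 2-bit partial sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)   # 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F0F0F                  # 8-bit partial sums
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24     # add the four bytes together


def frag_count_minus_popcount(frag_count, reass_bitmap):
    """Mirror the expression in Reassembler_Drop_: fragCount - popcount(reassBitmap)."""
    return frag_count - popcount32(reass_bitmap)
```

The fallback needs roughly a dozen arithmetic operations per call, whereas a CPU that exposes POPCNT does the same job in a single instruction.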

    Thanked by 1: nopint3
  • "When I asked the provider about it, they said they believe anything beyond x86-64-v1 is unnecessary for most users, and that if someone wants a different CPU model, they can open a ticket and request it."

    Are you kidding me? Do they expect all users to run a 2005-era web page or what?
    Modern software can benefit greatly from the expanded ISA, and the v4 extensions have already been used for AI inference on servers for 5+ years.

    Thanked by 2: nopint3, forest
  • Reading this thread sent a flashbang at near midnight

    Thanked by 1: nopint3
  • igctt Member

    @yoursunny said:
    We actively use x86-64-v2 instructions in our code.

    https://github.com/usnistgov/ndn-dpdk/blob/67f84648ea71c18aae7e6306f872a235207f5c5d/csrc/iface/reassembler.c#L50-L56

    __attribute__((nonnull)) static inline void
    Reassembler_Drop_(Reassembler* reass, LpL2* pm, hash_sig_t hash) {
      Reassembler_Delete_(reass, pm, hash);
    
      reass->nDropFragments += pm->fragCount - rte_popcount32(pm->reassBitmap);
      rte_pktmbuf_free_bulk((struct rte_mbuf**)pm->reassFrags, pm->fragCount);
    }
    

    This function is part of the reassembler in a hop-by-hop fragmentation and reassembly protocol.
    In the data structure, reassBitmap is a uint32 bitmap that indicates which fragments have arrived, and fragCount is the total number of fragments.
    If the reassembly fails due to timeout, the function is called, where the nDropFragments counter must be incremented with the number of dropped fragments.
    Instead of keeping an extra counter regarding "how many fragments have arrived", we used the POPCNT instruction.

    The underlying library also has specialized optimization for x86-64-v3 that can be applied automatically.
    However, we do not directly invoke AVX instructions ourselves.

    I remember that the DPDK library has architecture-specific optimizations written in assembly, especially for these rte_ functions, and that they are intended to use the latest hardware features.

  • layer7 Member, Host Rep, LIR

    Hi,

    IF you do not run any software that benefits from those CPU extensions, THEN it does not matter.

    But we are talking here about what are (nowadays) standard applications: VPN, media processing, and encryption-related stuff.

    If no CPU extension is available, the job has to be done by the CPU in "software" mode. That consumes significantly more CPU, because the CPU has to compute the result in a much more complicated and time-consuming way without access to the "hardware" acceleration that comes with the CPU features/extensions.

    So there are a lot of IFs here... I would actually not use providers that do not offer CPU passthrough, or at least a very recent CPU model such as Epyc or the Intel equivalent.

    With these very basic CPU models, even simple tasks will consume CPU power like hell. So the total value/quality of the server is reduced a lot: YES, the software will run, but your CPU consumption will explode. Simple tasks will eat away your available CPU power and run with noticeably bad performance.

    Thanked by 1: nopint3
  • forest Member
    edited March 18

    @oloke said: The only real downside of this approach is the lack of live migration between hypervisors with different CPUs, since CPU features can't change without an OS reboot. However, if all hypervisors have the same CPU model, live migration is achievable.

    Hosts that need to do live migration typically set the guest CPU model to the lowest common denominator. For example, if their oldest system is Broadwell, then they will set the CPU model to Broadwell. This is still leaps and bounds better than the default QEMU model, which is meant to be compatible with Pentium 4-era software.
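
The "least common denominator" idea can be sketched in a few lines of Python. This assumes each host is described by a plain list of CPU flag names; in practice providers pick a named CPU model rather than a raw flag intersection, so treat this as an illustration only. The function name and host names are hypothetical.

```python
def common_feature_set(hosts):
    """Return the flags present on every host in the cluster.

    `hosts` maps a host name to an iterable of CPU flag names. A guest that
    only sees this intersection can run on any of the hosts.
    """
    flag_sets = [set(flags) for flags in hosts.values()]
    if not flag_sets:
        return set()
    return set.intersection(*flag_sets)
```

With a cluster such as `{"old-node": ["sse2", "avx", "avx2"], "new-node": ["sse2", "avx", "avx2", "avx512f"]}`, the guest-visible set is everything except `avx512f`, which is still far more than a bare x86-64-v1 model offers.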

    @layer7 said: IF you do not run any software that benefits from those CPU extensions, THEN it does not matter.

    All software benefits from it, actually. It's not only heavy cryptography and video encoding that benefit. Even simple loops often get auto-vectorized by the compiler, and there are extensions that are not intended for performance but for security (see below).

    @oloke said: Restricting CPU features may reduce performance within the VM (since the guest OS cannot use the optimal instruction set) and, as a consequence, put a higher load on the hypervisor itself.

    It doesn't only improve performance, but security as well. The Linux kernel will automatically make use of hardware security features when they are present, such as SMEP (Supervisor Mode Execution Prevention, which unconditionally blocks execution of user pages in kernelspace), SMAP (Supervisor Mode Access Prevention, which blocks access of user pages when in kernel mode, except for in carefully-guarded APIs intended for user-kernel copies), and UMIP (User Mode Instruction Prevention, which blocks several sensitive instructions outside of kernel mode). The first two stop a wide variety of severe kernel-level exploits dead in their tracks. It's a huge waste if the CPU supports those features but the guest is unable to use them. Seriously, it's 2026. The kernel should not be able to jump into user code, ever.

    And you can say what you want about the trustworthiness of RDRAND/RDSEED, but at least they provide unique numbers that prevent a common VM-related randomness problem when they are injected into the kernel entropy pool.
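
A quick way to see whether a guest is missing these features is to look for the corresponding flags. The Python sketch below assumes a Linux guest where they appear in /proc/cpuinfo as "smep", "smap", "umip", "rdrand" and "rdseed"; the constant and function names are invented for this example.

```python
SECURITY_FLAGS = ("smep", "smap", "umip")
ENTROPY_FLAGS = ("rdrand", "rdseed")


def missing_flags(flags, wanted):
    """Return the entries of `wanted` that do not appear in `flags`, in order."""
    present = set(flags)
    return [name for name in wanted if name not in present]


def read_cpuinfo_flags(path="/proc/cpuinfo"):
    """Return the flag list of the first CPU listed in /proc/cpuinfo."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return line.split(":", 1)[1].split()
    return []
```

Calling `missing_flags(read_cpuinfo_flags(), SECURITY_FLAGS + ENTROPY_FLAGS)` on the guest lists whatever the exposed CPU model hides. An empty list means all five flags are visible.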

    @nopint3 said: When I asked the provider about it, they said they believe anything beyond x86-64-v1 is unnecessary for most users, and that if someone wants a different CPU model, they can open a ticket and request it.

    So they don't think guest security is important. You should tell us which provider it is. I'd like to avoid it.
