Server having packet loss after having more than 200 VMs on it?

Hi,
We have an Epyc server with a ConnectX-3 NIC, but we have a problem. Once there are more than about 200 VMs on it, we see packet loss. The network load on the server is not high, and we use Linux bridges for the KVM VMs.
Any thoughts?
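For context, a first diagnostic step on a Linux-bridge setup like this is usually to find out where the drops are actually being counted. These are standard iproute2/ethtool commands; the interface names `br0` and `eth0` are assumptions:

```shell
# Per-interface RX/TX drop counters (bridge and physical uplink)
ip -s link show br0        # br0 is an assumed bridge name
ip -s link show eth0       # assumed physical uplink

# NIC/driver-level counters (mlx4 driver for ConnectX-3);
# look for rx_dropped, rx_missed_errors, or out-of-buffer counters
ethtool -S eth0 | grep -Ei 'drop|miss|discard|buf'

# Softnet backlog drops: a non-zero second column means the kernel
# could not keep up with incoming packets on that CPU
awk '{print $1, $2}' /proc/net/softnet_stat
```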

Comments

  • I can tell your fortune from coffee grounds or your palm, but I can't tell you anything about server issues without a proper supply of logs.

    Thanked by 2emgh jazzii
  • Could be god.

    Thanked by 2mrs92 Hxxx
  • @emgh said:
    Could be god.

    image

    Thanked by 5emgh ralf mrs92 titus jazzii
  • @emgh said:
    Could be god.
    @jmgcaguicla said:
    image

    Maybe ... at the bottom of the sea now

  • Stick to 199 VMs. Problem solved.

  • WebProjectWebProject Host Rep, Veteran

    It depends on how much bandwidth all these 200 VMs consume at the same time, as you are limited by the network port speed.

  • ShakibShakib Member, Patron Provider
    edited November 2022

    200?

    More than 20 VMs on a single node makes me feel sick.

    Get a good network card if you don't have one already. We had some issues on one node before when some clients were running bridges and interfaces of their own inside the VMs (e.g. Docker, VPN, etc.). A better card solved the stability and lagging issues.

    Thanked by 1Hayashima
  • HxxxHxxx Member
    edited November 2022

    A true low-end provider. Love the fact that you have 200 VMs running on a single node.
    Can you share more statistics or specs of the node? Really interested. Maybe run a yabs test?

  • Haha, the node is a dual Epyc 7542 with 2 TB of RAM. Actually, it happens once we have over 100 VMs on it; the more VMs, the more packet loss. We think the problem is the bridge and want to try Open vSwitch. Any thoughts?

    Thanked by 1Hayashima
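If Open vSwitch is the route you take, replacing a Linux bridge with an OVS bridge is roughly this shape (bridge, uplink, and VM interface names are assumptions, and Virtualizor would need to be pointed at the new bridge):

```shell
# Create an OVS bridge and move the physical uplink onto it
ovs-vsctl add-br ovsbr0
ovs-vsctl add-port ovsbr0 eth0   # eth0 is an assumed uplink name

# In each VM's libvirt/QEMU network definition, attach the tap
# device to ovsbr0 instead of the old Linux bridge, e.g.:
#   <interface type='bridge'>
#     <source bridge='ovsbr0'/>
#     <virtualport type='openvswitch'/>
#   </interface>
```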
  • When you say packet loss, are you referring to packet loss on the WAN, or LAN? If it's the WAN, have you ensured your uplink isn't congested? what network card do you have in the server?

  • @dbContext said:
    When you say packet loss, are you referring to packet loss on the WAN, or LAN? If it's the WAN, have you ensured your uplink isn't congested? what network card do you have in the server?

    It's on the LAN; the uplink is fine.
    It's a Mellanox ConnectX-3 on 10G.

  • HxxxHxxx Member
    edited November 2022

    That's an amazing machine you got there. Beefy as F.
    Anyway, I haven't experienced that issue before, but if CPU, IO, and RAM load are all fine and you are not maxing out your port, then it might be the NIC. Try another card, as Shakib also mentioned.

    @Francisco might have experience on this.

    Thanked by 1Yakooza
  • I don't think it's the NIC. There should be a limit somewhere. It happens exactly when it hits 100 VMs.

  • PieHasBeenEatenPieHasBeenEaten Member, Host Rep

    Well, how many bridges are you running?
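For reference, listing the bridges and counting how many ports (VM tap interfaces) hang off each one can be done with iproute2; `br0` here is just an example name:

```shell
# All bridge devices on the host, with bridge-specific details
ip -d link show type bridge

# Number of ports attached to one bridge
ls /sys/class/net/br0/brif/ | wc -l   # br0 is an assumed name

# Per-port view across all bridges
bridge link show
```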

  • @tronyx said:
    Stick to 199 VMs. Problem solved.

    "199 is the magic number"
    takes note

  • Maybe some limit? Open files, or a network interface limit?
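Some of the limits worth ruling out can be checked quickly. These are all standard sysctls; whether any of them is actually the bottleneck depends on the distro defaults:

```shell
# Open file / file descriptor limits
sysctl fs.file-max
ulimit -n

# Neighbour (ARP) table limits: with hundreds of VMs behind one
# bridge, overflowing gc_thresh3 can look like random packet loss
sysctl net.ipv4.neigh.default.gc_thresh1 \
       net.ipv4.neigh.default.gc_thresh2 \
       net.ipv4.neigh.default.gc_thresh3

# Whether bridged traffic is being pushed through iptables at all
# (adds per-packet filtering cost on every bridge)
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables
```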

  • NeoonNeoon Community Contributor, Veteran

    @Hxxx said:
    A true low-end provider. Love the fact that you have 200 VMs running on a single node.
    Can you share more statistics or specs of the node? Really interested. Maybe run a yabs test?

    A true low-end-provider puts 500 of these on Raid 1 HDD's without cache.

    Thanked by 4Hxxx crilla lanefu Zyra
  • We narrowed it down. It seems the problem is with ebtables: it slows down and can't filter this many packets at once. We may want to try nftables. Any suggestions?
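For what it's worth, ebtables-style rules can be expressed in the nftables bridge family, which evaluates a ruleset in one pass rather than walking ebtables' linear chains per packet. A minimal sketch of a per-VM anti-spoofing rule (the tap interface name and MAC address are made-up examples):

```shell
# Bridge-family table/chain, filtering on the forward hook
# (the nftables replacement for ebtables FORWARD rules)
nft add table bridge filter
nft add chain bridge filter forward '{ type filter hook forward priority 0; }'

# Drop frames leaving a VM's tap interface with a spoofed source MAC
nft add rule bridge filter forward iifname "vnet42" \
    ether saddr != 52:54:00:aa:bb:cc drop

# Inspect the resulting ruleset
nft list table bridge filter
```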

  • what virtualization platform do you use?

  • You need a kernel-bypass, user-space packet processing framework like DPDK. I think Open vSwitch supports DPDK. Also, found this link https://support.mellanox.com/s/article/mellanox-dpdk

    Thanked by 1Yakooza
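Enabling DPDK in Open vSwitch (where the installed packages were built with DPDK support) is roughly this shape; the memory size, CPU mask, and PCI address below are assumptions to be tuned for the actual host:

```shell
# Tell OVS to initialize DPDK with hugepage memory and dedicated
# poll-mode-driver cores (values here are examples)
ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem="1024"
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6

# Userspace (netdev) datapath bridge with the NIC as a DPDK port
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \
    options:dpdk-devargs=0000:03:00.0   # PCI address is an example
```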
  • @masiqbal said:
    what virtualization platform do you use?

    QEMU, with the Virtualizor control panel.
