Does anyone have memory allocation failures on Racknerd KVM?

davide Member
edited November 2022 in Providers

I haven't been able to track the problem to its source, but every couple of weeks or so I find that either some daemon processes, like the webserver and NodeJS instances, have been terminated unexpectedly, or that the KVM instance itself has rebooted on its own and some of the automatically started daemons (again, the webserver and NodeJS instances) are not running after the restart.

There are no system logs reporting either OOM conditions or errors. Not even the random reboots are logged, as if the KVM instance were hard-reset without an ACPI signal being issued: only the subsequent boots are logged, not the preceding shutdowns, which happen unexpectedly every few weeks.

I suspect that Racknerd KVM instances may be overcommitted with respect to the host server's memory, and that this may cause spurious crashes of guest processes and of the VM itself. I don't think this is implausible considering the extremely low prices.

Anyone else noticing similar weirdness?
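
For anyone wanting to check the same thing, here is a rough sketch of one way to spot boots that were not preceded by a logged shutdown. It assumes `last` can read a wtmp file that still covers the period in question; the function name `count_unclean_boots` is made up for illustration.

```shell
#!/bin/bash
# Sketch: count boots that have no shutdown record before them.
# Reads `last -x reboot shutdown` output on stdin (newest entry first).
# Two "reboot" lines in a row mean the older boot ended without a
# logged shutdown, e.g. after a hard reset.
count_unclean_boots() {
    awk '
        $1 == "reboot"   { if (prev == "reboot") n++; prev = "reboot" }
        $1 == "shutdown" { prev = "shutdown" }
        END              { print n + 0 }
    '
}

# Usage on the affected VM (assumes wtmp has not been rotated away):
#   last -x reboot shutdown | count_unclean_boots
```

A non-zero count only shows that a shutdown record is missing; it does not say whether the cause was inside the guest or on the host.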

Comments

  • I actually run three VPS servers with them and have never had a similar issue. Everything is working as expected. I had a similar issue with another provider, and I asked them to migrate my VPS; they did it, and the problem is gone.

    Sorry, I can't be of much assistance! 

    Thanked by 1dustinc
  • WebProject Member, Host Rep

    Contact support and ask them to resolve the issue and double the bandwidth and RAM; if they do oversell, it costs them peanuts.

  • Did you check your resource usage? Even if the RAM is not dedicated, you would need high RAM usage for that to happen. If you install Netdata, you will be able to track resource usage on your VPS, but that will probably not help you identify the problem. I never managed to identify the problem with my previous provider; all I could report was that, without my doing anything, the instance got rebooted every day.

  • dustinc Member, Patron Provider, Top Host

    Hi @davide -- Thank You for being our valued customer :) This definitely should not be the case - our services are not overloaded, and quite the opposite of what we're known for (we overall have a very solid reputation for providing solid performance and services). In addition to that, we proactively monitor the health and status of all of our host nodes (though we do not monitor individual VM's with our unmanaged services).

    If you don't mind, shoot me an e-mail at [email protected] with your VPS IP, I'll double check things on our end, and help you out with a resolution path.

  • escalated quickly and resolved ... as usual ... imaging is number one B)

    Thanked by 1dustinc
  • davide Member
    edited November 2022

    I'm monitoring the instance with Munin, I have graphical plots of everything. Memory usage is somewhat high, but stable, with little variance over time: 70% is allocated by user processes, 25% is cache, and 5% is for buffers or unused. I copied this filesystem tree (/etc, /home, ...) as-is from my previous VPS provider from a VM with the same amount of memory (2 GB) but at a significantly higher price. With the same OS, config files and processes, it ran for 10 years there without such issues.

  • @dustinc said:
    Hi @davide -- Thank You for being our valued customer :) This definitely should not be the case - our services are not overloaded, and quite the opposite of what we're known for (we overall have a very solid reputation for providing solid performance and services). In addition to that, we proactively monitor the health and status of all of our host nodes (though we do not monitor individual VM's with our unmanaged services).

    If you don't mind, shoot me an e-mail at [email protected] with your VPS IP, I'll double check things on our end, and help you out with a resolution path.

    After you "reintroduced" Amsterdam during this BF, the performance of my Amsterdam VPS, which I have had for 1.5 years, went down by 40-50%.

    For example, the multicore (2-core) Geekbench 5 score is 600-700.
    Will you also take a look at my node if I send you an e-mail?

  • dustinc Member, Patron Provider, Top Host

    @AXYZE said:

    @dustinc said:
    Hi @davide -- Thank You for being our valued customer :) This definitely should not be the case - our services are not overloaded, and quite the opposite of what we're known for (we overall have a very solid reputation for providing solid performance and services). In addition to that, we proactively monitor the health and status of all of our host nodes (though we do not monitor individual VM's with our unmanaged services).

    If you don't mind, shoot me an e-mail at [email protected] with your VPS IP, I'll double check things on our end, and help you out with a resolution path.

    After you "reintroduced" Amsterdam during this BF, the performance of my Amsterdam VPS, which I have had for 1.5 years, went down by 40-50%.

    For example, the multicore (2-core) Geekbench 5 score is 600-700.
    Will you also take a look at my node if I send you an e-mail?

    Hi @AXYZE -- Thank You for your business over the years (wow, time goes by quick!) Happy to hear that our services have been working out well for the better part of the 1.5 years you've been with us.

    We're not seeing any monitoring events in Amsterdam at the moment, nonetheless, it doesn't hurt to double check - feel free to send me an e-mail with your VPS IP and I'd be happy to take a look, we can evaluate real usage, etc. [email protected]

    We sincerely appreciate your business and look forward to working with you for another 1.5 years, and then some more :) Thank You again.

  • I have a VPS in LA and so far no problem

    Thanked by 1dustinc
  • davide Member
    edited November 2022

    @dustinc said:
    Hi @davide -- Thank You for being our valued customer

    Hi dustin,

    we already discussed this issue months ago, to no avail: you blamed me for incompetently sizing the VPS and prompted me to buy an expansion pack of even more (oversold) system memory.

    As you know, the same malfunctions I described here were already reported multiple times on TrustPilot by numerous customers, me included. These negative reviews have now mysteriously disappeared. My own review went down under the pretext that I violated TrustPilot's ToS despite being honest, eloquent and polite.

    Only one of these reviews can still be found on archive.org

    Thanked by 1pedagang
  • Daniel15 Member
    edited November 2022

    @davide said: I'm monitoring the instance with Munin

    Munin runs every 5 minutes, so you'll likely miss any anomalies unless they last longer than a few minutes. I'd recommend upgrading to Netdata instead, which updates its stats every second. I switched a few years ago and would recommend it.

    Thanked by 1davide
  • dustinc Member, Patron Provider, Top Host
    edited November 2022

    @davide said:

    @dustinc said:
    Hi @davide -- Thank You for being our valued customer

    Hi dustin,

    we already discussed this issue months ago, to no avail: you blamed me for incompetently sizing the VPS and prompted me to buy an expansion pack of even more (oversold) system memory.

    As you know, the same malfunctions I described here were already reported multiple times on TrustPilot by numerous customers, me included. These negative reviews have now mysteriously disappeared. My own review went down under the pretext that I violated TrustPilot's ToS despite being honest, eloquent and polite.

    Only one of these reviews can still be found on archive.org

    Hi @davide -- appreciate you following up. I was able to pull up your account, and I can see one and only one support ticket under it (let me know if I'm mistaken, or if you created a ticket from a different email, etc). In that ticket, our team followed up suggesting that you may have OOM'd as a result of the applications running within your VPS, but we did not hear back from you after our response on July 7. You mentioned here that you purchased an expansion/upgrade pack, but I am not seeing any upgrades or addons purchased, nor any communication within your account related to purchasing an upgrade: your VPS still has the original amount of RAM included with the package (again, feel free to correct me if I'm mistaken).

    In your ticket, you mentioned some concerns about your VM utilizing swap space, and I do see that our team followed up with suggestions for tuning the vm.swappiness level from within your VM's operating system. By default most Linux distributions set vm.swappiness to 60 (also keep in mind https://www.linuxatemyram.com/), so some swap usage is normal unless you specify otherwise via vm.swappiness within your OS.

    Furthermore, I just personally checked the node your VPS resides on (DAL107KVM) and do not see any current issues nor any resource contention (this node actually looks to be pretty quiet in terms of resource utilization). Even so, we'd like to work with you on a resolution path here, if you don't mind. Ultimately we're only happy if our customers are happy, so let's work together on identifying an amicable solution. I'll be reaching out to you at the e-mail on file shortly and we can go from there.
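
    For reference, a rough sketch of how the swappiness setting mentioned above can be inspected from inside the VM. The helper name `current_swappiness` and the value 10 are illustrative assumptions, not something taken from the ticket.

    ```shell
    #!/bin/bash
    # Sketch: print the current vm.swappiness value.
    # The optional argument only exists so the function can be pointed
    # at a different file; by default it reads the live kernel setting.
    current_swappiness() {
        cat "${1:-/proc/sys/vm/swappiness}"
    }

    # Usage:
    #   current_swappiness                 # often 60 on a default install
    #   sudo sysctl -w vm.swappiness=10    # illustrative value; lasts until reboot
    ```

    A lower value makes the kernel less eager to swap; it does not change how much memory the VM actually has.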

  • jackb Member, Host Rep
    edited November 2022

    Just a thought @dustinc

    There are no system logs reporting either OOM conditions or errors. Not even the random reboots are logged, as if the KVM instance were hard-reset without an ACPI signal being issued: only the subsequent boots are logged, not the preceding shutdowns, which happen unexpectedly every few weeks.

    I've seen this sort of thing a few times before -- where there was sufficient memory available on the host but VMs were OOM killed anyway.

    This would be visible in the qemu log file and messages log file on the host.

    Probably worth a look. I found that the disk cache mode of the VM was a trigger -- in writeback the issue would occur; in none - it wouldn't.
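
    A rough sketch of the host-side check described above: filter kernel log text for OOM-killer lines whose victim looks like a QEMU/KVM process. The log path and the process-name pattern are assumptions that vary by distribution and kernel version, and `find_vm_oom_kills` is a made-up helper name.

    ```shell
    #!/bin/bash
    # Sketch: read kernel log text on stdin and print OOM-killer lines
    # whose victim process name starts with "qemu" or "kvm".
    find_vm_oom_kills() {
        grep -Ei 'out of memory: kill(ed)? process [0-9]+ \((qemu|kvm)'
    }

    # Usage on the host (path is an assumption; some distros log to
    # /var/log/syslog or only to the journal):
    #   find_vm_oom_kills < /var/log/messages
    ```

    On libvirt-managed hosts, the disk cache mode shows up as the `cache` attribute of the disk `<driver>` element in `virsh dumpxml` output.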

  • dustinc Member, Patron Provider, Top Host

    @jackb said:
    Just a thought @dustinc

    There are no system logs reporting either OOM conditions or errors. Not even the random reboots are logged, as if the KVM instance were hard-reset without an ACPI signal being issued: only the subsequent boots are logged, not the preceding shutdowns, which happen unexpectedly every few weeks.

    I've seen this sort of thing a few times before -- where there was sufficient memory available on the host but VMs were OOM killed anyway.

    This would be visible in the qemu log file and messages log file on the host.

    Probably worth a look. I found that the disk cache mode of the VM was a trigger -- in writeback the issue would occur; in none - it wouldn't.

    Appreciate this @jackb -- I can confirm that by default we do have disk cache set to ‘none’, unless specifically requested otherwise by the customer.

    I’ve reached out to the OP via email shortly after my latest reply here (currently pending his response) -- always happy to dig deeper and curious to do so. Based on our very last discussion as commented above, we’d be happy to see where things are at today within his environment, and take it from there.

  • davide Member
    edited November 2022

    @Daniel15 said:
    Munin runs every 5 minutes, so you'll likely miss any anomalies unless they last longer than a few minutes. I'd recommend upgrading to Netdata instead, which updates its stats every second. I switched a few years ago and would recommend it.

    That's a good idea, to monitor memory usage at shorter intervals. My Munin setup is already too customized and too integrated with a fleet of other monitored VMs to throw it away casually, so for the moment I wrote this Bash script to log memory usage on my Racknerd VM:

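    #!/bin/bash
    # Log a timestamped memory snapshot once per second.
    
    while :; do
        date
        free -m    # values in MiB
        echo
    
        sleep 1
    done >>monitor.log    # append, so a restart after a crash keeps earlier samples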
    

    Right now it's printing values close to these: (all values in MB)

    Tue 29 Nov 2022 00:33:19 AM CET
                  total        used        free      shared  buff/cache   available
    Mem:           1995        1427         264           1         302         617
    Swap:           510           0         510
    

    So far, with 20 minutes logged, there's almost no variance in memory consumption over time, consistent with the Munin charts.

    If I keep getting crashes that are not justified by memory exhaustion within the VM, I'll find a way to further reduce memory allocations in userspace, until the Racknerd contract expires in a few months. We'll see.
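
    A possible refinement of the script above, as a sketch only: log one compact line per second and flush it to disk, so the last samples before a hard reset are more likely to survive. It assumes a coreutils `sync` that accepts a file argument and a kernel that exposes `MemAvailable` in /proc/meminfo; `mem_available_mib` and `log_memory` are hypothetical names.

    ```shell
    #!/bin/bash
    # Sketch: print MemAvailable in MiB from a meminfo-style file
    # (defaults to /proc/meminfo, where the value is given in kB).
    mem_available_mib() {
        awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
    }

    # Append one timestamped sample per second to the given log file
    # and sync it, so little is lost if the VM is reset without warning.
    log_memory() {
        local logfile=${1:-monitor.log}
        while :; do
            printf '%s available_mib=%s\n' "$(date '+%F %T')" "$(mem_available_mib)" >>"$logfile"
            sync "$logfile"
            sleep 1
        done
    }

    # Usage (runs until interrupted):
    #   log_memory monitor.log
    ```

    The "available" figure is used here because it already accounts for reclaimable cache, which makes a real shortage easier to tell apart from normal cache growth.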
