Monitoring alerting down, but server working

I have a server at OVH with Proxmox 6.

I use two monitoring systems, HetrixTools and nMon, to avoid false positives.

Almost every day I receive alerts from both systems saying the server is not sending data (down). However, the server is working normally: I can ping it, access the Proxmox GUI, SSH in, and so on. I'm also monitoring all VMs on that Proxmox server, and there are no alerts for them.

This problem always happens in the early hours, between 4 AM and 9 AM; it has never happened outside those hours.

I monitor other Proxmox servers with the same systems, and it does not happen on any of them.

It is not a firewall problem: I tried disabling the firewall and the problem still occurs.

Does anyone have any idea what it could be?

Comments

  • hostworld Member, Host Rep

    I would look into the ICMP rate limit value. If you have multiple services monitoring your server all sending ICMP coincidentally at the same time it may be this which is causing the problem.

  • HBAndrei Member, Top Host, Host Rep

    @hostworld said: I would look into the ICMP rate limit value. If you have multiple services monitoring your server all sending ICMP coincidentally at the same time it may be this which is causing the problem.

    No. As far as I understand, he is using heartbeat monitoring, so no external locations are pinging his server; rather, his server must send data every minute to be considered up.

    To OP: I would check whether cron jobs are running properly during that window, and also check that your server can resolve DNS and load websites at those times.

    We actually have an article with some steps you could take to try and debug agent data sending issues on your server:
    https://docs.hetrixtools.com/what-to-do-if-the-server-monitoring-agent-isnt-sending-any-data/

    Of course the spectrum of issues could be much wider, but it's a good place to start.

    Cheers.
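
One way to test the DNS part of that suggestion is to log resolution attempts with timestamps and compare them against the alert times. The following is only a minimal sketch, not part of any monitoring agent: `check_dns` and `log_line` are hypothetical helper names, and the hostname passed in is whatever endpoint you want to test.

```python
import socket
import time
from datetime import datetime, timezone


def check_dns(hostname, resolver=socket.getaddrinfo):
    """Try to resolve hostname; return (ok, seconds_taken, detail)."""
    start = time.monotonic()
    try:
        infos = resolver(hostname, 443)
        detail = infos[0][4][0]  # first resolved address
        ok = True
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        detail = str(exc)
        ok = False
    return ok, time.monotonic() - start, detail


def log_line(hostname, result, now=None):
    """Format one timestamped line suitable for appending to a log file."""
    ok, seconds, detail = result
    now = now or datetime.now(timezone.utc)
    status = "OK" if ok else "FAIL"
    stamp = now.isoformat(timespec="seconds")
    return f"{stamp} dns {hostname} {status} {seconds:.3f}s {detail}"
```

Calling `print(log_line(host, check_dns(host)))` once a minute, with `host` set to the endpoint of interest, and keeping the output would show whether failed or slow lookups line up with the 4 AM to 9 AM window.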

  • mywebhosting Member, Host Rep

    @juniorrrrr said:
    I have a server at OVH with Proxmox 6.

    I use two monitoring systems, HetrixTools and nMon, to avoid false positives.

    Almost every day I receive alerts from both systems saying the server is not sending data (down). However, the server is working normally: I can ping it, access the Proxmox GUI, SSH in, and so on. I'm also monitoring all VMs on that Proxmox server, and there are no alerts for them.

    This problem always happens in the early hours, between 4 AM and 9 AM; it has never happened outside those hours.

    I monitor other Proxmox servers with the same systems, and it does not happen on any of them.

    It is not a firewall problem: I tried disabling the firewall and the problem still occurs.

    Does anyone have any idea what it could be?

    Did you monitor the load between 4 AM and 9 AM? The load may be high during those hours.

  • Webdock_io Member, Host Rep

    How does your CPU utilization look when you get these alerts? If all your VMs are Linux machines, check the daily crontab, as you might be hitting a synchronized cron job on all VMs that causes a CPU spike and degraded performance for a while. We've seen this on "overcommitted" hosts/nodes where strange things were happening around the same time every day: the cause was simply that all the virtual machines kicked into high gear executing a cron job at the same time.
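
To check the load theory raised in the last two comments, the host's load average can be sampled periodically and compared with the alert timestamps. This is a rough sketch under a few assumptions: the helper names are made up for illustration, the script runs on the Proxmox host itself (Linux), and a 1-minute load above 1.0 per core is used as an arbitrary "high" threshold.

```python
import os
from datetime import datetime


def load_per_core(load1, cores):
    """1-minute load average divided by core count (guards against 0 cores)."""
    return load1 / max(cores, 1)


def format_sample(now, loads, cores, threshold=1.0):
    """Build one log line; flag it HIGH when per-core load exceeds threshold."""
    per_core = load_per_core(loads[0], cores)
    flag = "HIGH" if per_core > threshold else "ok"
    return (
        f"{now:%Y-%m-%d %H:%M:%S} "
        f"load={loads[0]:.2f},{loads[1]:.2f},{loads[2]:.2f} "
        f"per_core={per_core:.2f} {flag}"
    )


def sample():
    """Take a live reading; os.getloadavg() is available on Unix-like systems."""
    return format_sample(datetime.now(), os.getloadavg(), os.cpu_count() or 1)
```

Appending the output of `sample()` to a file every minute overnight would show whether `HIGH` readings cluster around the times the heartbeats go missing.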

  • launchvps Member, Patron Provider

    4 AM to 9 AM could also be prime time for overnight backups to run, perhaps to an off-server NAS.

    So the switch/link (shared?) could be saturated briefly while this data is being sent to other places on the network.
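
The saturation theory can be checked from the host side by reading the interface byte counters in `/proc/net/dev` at two points in time and converting the difference to a rate. The sketch below assumes a Linux host; `parse_net_dev` and `rate_mbps` are illustrative names, and the interface to watch (for example a Proxmox bridge) depends on the setup. It only measures the server's own traffic, so it cannot show congestion caused by other machines on a shared link.

```python
def parse_net_dev(text):
    """Parse /proc/net/dev content into {interface: (rx_bytes, tx_bytes)}."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or len(fields) < 9:
            continue
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rate_mbps(bytes_before, bytes_after, seconds):
    """Average rate in megabits per second between two counter readings."""
    return (bytes_after - bytes_before) * 8 / seconds / 1_000_000


def read_counters(path="/proc/net/dev"):
    """Read the live counters from the kernel (Linux only)."""
    with open(path) as handle:
        return parse_net_dev(handle.read())
```

Taking two `read_counters()` readings a known number of seconds apart and passing the byte values for one interface to `rate_mbps` gives the average throughput over that interval, which can then be compared with the link speed.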
