CRON service crashed

J1021J1021 Member

Today the cron service crashed on my server as the result of a load spike. Because of that, Observium didn't poll my devices for around 3 hours, until I noticed.

How can I prevent something like this from happening again, or at least have some sort of check that will notify me if the service is not running?

Comments

  • mikhomikho Member, Host Rep

    Monit?

  • AmitzAmitz Member

    Monit!

  • +1 for Monit

  • perennateperennate Member, Host Rep
    edited June 2014

    Um, and what if the monit process crashes? You can set up a script that confirms cron is running, then have an external server check that script (via TCP, HTTP, whatever) at a regular interval and alert if it's not working. Easiest would be a web script, so you can use anything that supports HTTP checks.

  • rds100rds100 Member

    A service shouldn't crash just like this, just because there was some load spike on the box. Was there an Out Of Memory situation? What is this - OpenVZ / KVM / dedi?

  • J1021J1021 Member

    Dedicated box. SWAP is at 100% but memory sits around 50% used.

  • rds100rds100 Member

    Add more swap for such cases.

  • @rds100 said:
    Add more swap for such cases.

    Adding swap is never a solution.

    In most cases, swap shouldn't be enabled at all on servers.

  • geekalotgeekalot Member
    edited June 2014

    @perennate said:
    Um, and what if the monit process crashes? You can set up a script that confirms cron is running, then have an external server check that script (via TCP, HTTP, whatever) at a regular interval and alert if it's not working. Easiest would be a web script, so you can use anything that supports HTTP checks.

    And any reasonable monitoring strategy should ALWAYS include an internal/local monitor and external monitors (or at least something to monitor the monitor). Obviously, the OP has additional troubleshooting to do.

  • nerouxneroux Member

    @rds100 said:
    A service shouldn't crash just like this, just because there was some load spike on the box. Was there an Out Of Memory situation? What is this - OpenVZ / KVM / dedi?

    Fully agreed. An actual crash should be reported to the developers.
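
For reference, the "web script" idea suggested earlier in the thread (a local script that confirms cron is running, polled over HTTP by an external monitor) could look roughly like the sketch below. It is only an illustration, not anything posted by the participants. It assumes a Linux box with /proc mounted, that the daemon's process name is either cron or crond (this varies by distribution), and that port 8099 is free; the names CRON_NAMES, running_process_names, cron_is_running and HealthHandler are all made up for this example.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: the cron daemon shows up as "cron" (Debian/Ubuntu) or "crond" (RHEL/CentOS).
CRON_NAMES = {"cron", "crond"}


def running_process_names(proc_root="/proc"):
    """Return the set of process names found under proc_root (Linux only)."""
    names = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are PIDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                names.add(f.read().strip())
        except OSError:
            continue  # process exited while we were scanning
    return names


def cron_is_running(names=None):
    """True if any known cron daemon name is among the running processes."""
    if names is None:
        names = running_process_names()
    return bool(CRON_NAMES & set(names))


class HealthHandler(BaseHTTPRequestHandler):
    """Answers every GET with 200 if cron is up, 503 if it is not."""

    def do_GET(self):
        ok = cron_is_running()
        body = b"cron OK\n" if ok else b"cron DOWN\n"
        self.send_response(200 if ok else 503)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Hypothetical port; restrict access with a firewall if the box is public.
    HTTPServer(("0.0.0.0", 8099), HealthHandler).serve_forever()
```

Any external HTTP check can then poll this endpoint and alert on a non-200 status or a timeout. A timeout also covers the case where the whole box, or this script itself, has gone down, which is the "monitor the monitor" point raised above.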
