Project idea: Services checker and re-starter
Hi,
I had an issue earlier with one of my VPSes that became annoying: services collapsing and making the server unresponsive. This is apparently due to a memory shortage, possibly from overselling on the provider's side.
I opened a thread about that issue, and it somehow led to a suggestion that might be useful anyway:
A local script that runs at a fixed time interval from a cron job, checks the user's critical services (e.g. Apache/Nginx, MySQL server, DNS), and only sends an email notification when it finds a service down (the address could be one that forwards messages to SMS). Meaning: no notifications sent = things are (supposedly) up and running.
However, rds100 suggested that:
@rds100 said: Local bash script is not the best solution, because it can die too, the same way as other services die. The monitoring must be remote.
That's true, since the script won't always run successfully if no memory is available, or for other performance reasons. So a remote monitoring service was suggested.
However, I then suggested a local script again, but one that also listens on a port and can be monitored externally by a third-party monitoring service that notifies by SMS/email only when the script itself is down. As I see it, in this case the monitoring script needs to run all the time in the background, so there is probably no need for cron jobs.
So, the idea for now is:
A script runs locally on the VPS, stays in the background, and monitors user-specified critical services. Once any of them is down for x period of time, an email notification is sent to the user naming the service that has been down for x consecutive checks. OR, the monitoring script automatically tries to restart the service x times before sending each notification. If the service starts successfully, a notification is sent stating that the service went down but was successfully restarted, and that no action is required from the user.
If the monitoring script itself goes down: since the script listens on a port, an external third-party monitoring service is set to monitor its uptime, and it sends an email notification once the monitoring script (or the whole VPS) is down or not responding.
That's it. I thought I would just throw it out there for evaluation of whether it's useful or not. Also, I don't think I will be trying to build it myself anytime soon.
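A minimal sketch of the check, retry and notify logic described above, written as plain shell functions. The names is_running, notify and check_and_restart, and the RETRY_DELAY variable, are made up for illustration; notify only writes to stderr here and would need to be swapped for whatever mail or SMS command is available. The externally monitored port is not covered.

```shell
#!/bin/sh
# Sketch only: all function names and RETRY_DELAY are hypothetical.

# is_running NAME: succeeds if a process with exactly this name exists.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# notify MESSAGE: placeholder that only logs to stderr.
# Replace the body with mail(1), sendmail or an SMS gateway call.
notify() {
    echo "NOTIFY: $1" >&2
}

# check_and_restart NAME START_CMD MAX_TRIES
# Returns 0 if the service is up (possibly after a restart), 1 otherwise.
check_and_restart() {
    name=$1; start_cmd=$2; max_tries=$3
    is_running "$name" && return 0          # healthy: stay silent
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        tries=$((tries + 1))
        sh -c "$start_cmd" >/dev/null 2>&1  # attempt a restart
        sleep "${RETRY_DELAY:-2}"           # give the service time to come up
        if is_running "$name"; then
            notify "$name went down but was restarted (attempt $tries); no action required"
            return 0
        fi
    done
    notify "$name is down after $max_tries restart attempts"
    return 1
}
```

A call such as check_and_restart nginx "/etc/init.d/nginx start" 3 could then be run from a loop in a background script or from a cron job; the nginx name and init path are just examples.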
Comments
You could take a look at @NickM 's OpenStatus. It does service monitoring, and email notification.
Yes, it can do that more or less; the difference is that this one is supposed to run locally on the same VPS. It should also try to start a dead service before sending each notification.
Using a script to try to restart a service that died can lead to even worse issues, which is why OpenStatus doesn't do it. For example, with MySQL, if you're using replication and it dies and tries to restart? You might end up with inconsistent data on your slaves and you'll have to sort it out. Granted, if you're using MySQL replication, you should know that and disable it, but still...
Sign up for http://www.uptimerobot.com and create a custom port check. Done and done.
edit: Oh, you want something that automatically restarts the service too. Just use a simple bash script for that... or just use something like http://supervisord.org
If you need monitoring and restarting check monit
Thanks
Well, it would be a good addition if it could actually be included in OpenStatus. I don't think it should be considered problematic just because of the probably few cases where it leads to problems. I mean, starting the web server, and DNS if used, probably won't cause a problem, nor will the MySQL server in most cases (given that replication is the main source of trouble). That can be covered in the README, plus a bold notice just in case. So maybe you could consider something similar in future updates.
It would be great if service start commands were added by the user himself, as needed, in the config. So if he didn't add a start command on the MySQL line in the config, OpenStatus won't try to start MySQL. And if he didn't add a start command next to any of the listed services, no services will be started (same as the feature being disabled). A note could also be added next to the MySQL line, and next to other services that might cause trouble when automatically restarted.
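To make the per-service start command idea concrete, here is a rough sketch that assumes a made-up config format of one "service|start command" pair per line, where an empty command means "never auto-start this service". Neither the format nor the start_cmd_for function comes from OpenStatus; both are hypothetical.

```shell
#!/bin/sh
# Hypothetical config format, one service per line:
#   nginx|/etc/init.d/nginx start
#   mysql|
# Lines starting with '#' and blank lines are ignored.

# start_cmd_for CONF NAME
# Prints the start command configured for NAME (possibly empty).
# Returns 1 if NAME is not listed in CONF at all.
start_cmd_for() {
    while IFS='|' read -r name cmd; do
        case $name in ''|'#'*) continue ;; esac   # skip blanks and comments
        if [ "$name" = "$2" ]; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done < "$1"
    return 1
}
```

A monitor would call start_cmd_for for a failed service and only attempt a restart when the result is non-empty, so leaving every command blank is the same as disabling the feature.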
Thanks I have an uptimerobot account, I'll be checking the other suggestion.
Thanks for your suggestion; I might try monit. Even though there are full-featured monitoring services that do most of these things, I was looking for a simple way to get it done.
the simplest way would be a script in cron
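#!/bin/bash
# Look for the program in the process list; grep -v grep drops the grep itself.
ps -ef | grep -v grep | grep Your-Prog > /dev/null
# grep exits with 1 when nothing matched, i.e. the program is not running.
if [ $? -eq 1 ]
then
    # Placeholder: replace with the real restart command for your program.
    /etc/init.d/Your-Prog start
fi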
from my blog here akamaras.com/linux/linux-script-to-check-if-a-service-is-running-and-start-it-if-its-stopped/
Thank you! Going to test that, I guess cron is the way to go to make it simple.
Thank you for that great script. I will test it for some time to see how it goes under real conditions, once the issue repeats.
Good Lord, there's a lot of wheel reinvention here.
The Unix way(*) to start a process if it fails is to use inittab, which some distros have retired in favor of upstart. 'respawn' is the configuration you want. This has been in Unix since at least the early 90s.
(*) at least for predominantly SysV-derived Unices like Linux. I don't know what the equivalent is in BSD off hand.
Take a look at my cron/screen based services restarter/starter. I use it and it works well
https://github.com/maxexcloo/User-Daemon
There is a worse case: the service's process (for example lighttpd) is there but sits and does nothing (the service stalls).
my freebsd cron script:
Respawning a web service like Apache (and depending on how you have Apache set up it won't even work) or MySQL through init is a pain in the ass and not really the point of init. A simple cron script is a lot easier to manage, especially when you need to permanently stop the service for a while.
You can also use the daemontools
http://cr.yp.to/daemontools.html
Right, probably I should've searched for any available/similar scripts beforehand, but I was also short on time recently.
As for service start commands, that's why I suggested that @NickM include a config file preloaded with a list of the most-used services, with a space next to each one for the user to add a start command himself, only for the services he wants started on failure. The commands are left to the user since they depend on the distro and program versions.
Have a look at http://puppetlabs.com/.
that
Why use difficult scripts to check if a service is running? Just put "/etc/init.d/service start" in your crontab, if the service is already running nothing will happen.
Thanks. looks good, and I'm giving it a try
It's a good point to add in a new script as well, thank you.
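For the stalled-service case (process present but not answering), a process-list check is not enough; one option is to make a real request and restart when it fails. The following is a sketch under assumptions: it relies on curl being installed, and the function names http_responds and restart_if_stalled are invented for illustration.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; requires curl.

# http_responds URL [TIMEOUT_SECONDS]
# Succeeds only if the URL answers with a non-error HTTP status in time.
http_responds() {
    curl -fsS --max-time "${2:-5}" -o /dev/null "$1" 2>/dev/null
}

# restart_if_stalled URL RESTART_CMD
# Runs RESTART_CMD when the URL does not respond, even if the
# service's process is still present in the process list.
restart_if_stalled() {
    http_responds "$1" && return 0
    sh -c "$2"
}
```

For example, restart_if_stalled http://127.0.0.1/ "/etc/init.d/lighttpd restart" would catch a lighttpd that is running but no longer serving requests; the URL and init path are examples only.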
Thanks
Might eventually
What a great solution. thanks for posting that!
Thanks. And that's full featured.
Thanks! A good suggestion. Even though it won't notify by email.
Just as a hint.... why not use Monit?
http://mmonit.com/monit/
Runs as service and can easily be adapted to every app that has a PID file - Checks if the app works etc etc.
Example config for nginx:
#check nginx now
check process nginx with pidfile /var/run/nginx.pid
  start program = "/etc/init.d/nginx start"
  stop program = "/etc/init.d/nginx stop"
  if failed host IP.IP.IP.IP port 80 protocol HTTP request / then restart
  if 100 restarts within 100 cycles then timeout
or php:
check process php-fpm with pidfile /var/run/php5-fpm.pid
  group phpcgi # phpcgi group
  start program = "/etc/init.d/php5-fpm start"
  stop program = "/etc/init.d/php5-fpm stop"
  ## Test the UNIX socket. Restart if down.
  if failed unixsocket /tmp/php-cgi.sock then restart
  ## If the restarts attempts fail then alert.
  if 100 restarts within 100 cycles then timeout
Can send email and SMS, and has a nice web interface where the status can be seen and the process can be restarted manually. Multiple servers can be grouped in one interface by MMonit.
We use Hyperic at my day job (monitoring a major US retail website). It's pretty powerful and there is an open source version: hyperic.com
The best way to do this is to have a check script spawned by cron every so often; it won't take up memory when it isn't executing. Crond rarely fails.