What do you guys use to monitor servers?
Hi,
We have 200+ servers across several locations. We use New Relic/Nagios to monitor them.
However, we are finding it difficult to get certain business-centric data, such as the following:
Which servers are not performing optimally?
Which servers require an upgrade?
Is there a solution that will show us all server stats on a single page, with average data for the past 15 days?
Comments
It seems many LowEndProviders are using the well-tested CBID (Customer Based Issue Detection). It's really simple and requires minimal setup. All you have to do is wait for a ticket to be created because the "Server is down".
Some providers also require a post on the LET forums as a confirmation.
I use 4:
PRTG, StatusCake, Uptime Robot, Nixstats
Ah, the old "The Customer Must Do All The Work" method. I can dig it.
nodeping + observium
This actually adds a level of sophistication, because the urgency of the problem can easily be judged by the number of pages the thread runs. One page is yellow, two pages is orange, and three pages is code red.
We use Pingdom and nixstats to monitor our servers.
@Nexhost
Uptime Robot for hosts and New Relic for servers
LibreNMS for internal use, NixStats for public/customers
NixStats, NewRelic Synthetics, Uptime Doctor - all four have free 1 minute monitoring from different locations.
smokeping, icinga, librenms
Zabbix, Observium or Nagios. For non critical services, NixStats.
Have been using Nixstats but got a HostUS box last month, playing with Nagios a lot & soon will move everything to Nagios.
cloudstats.me
I use @onepound 's external monitoring service (free 10 checks for clients).
I'm very impressed.
I use @vfuse nixstats and LibreNMS. The Nixstats SMS notifications aren't working for me, though; otherwise it does a pretty amazing job.
OP is really asking for two different things.
The first ("which servers are not optimally performing") is arguably more capacity planning or APM than "monitoring". There's a whole subindustry devoted to this. What does "not optimally" mean? Is it something as crude as CPU load, or something more sophisticated like "number of milliseconds for the query to return"? If the latter, are you instrumenting at every level of the stack (server, network, app, database, etc.)?
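For the crude end of that spectrum, a minimal Python sketch of a per-server average over the past 15 days might look like the following. The sample format, the function names and the threshold are all assumptions for illustration; none of the tools mentioned in this thread is implied to work this way.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def rolling_averages(samples, now, window_days=15):
    """Return {server: mean value} over the last `window_days` days.

    `samples` is an iterable of (server, timestamp, value) tuples, where
    `value` is whatever metric you collect (CPU load, query latency, ...).
    This tuple layout is a hypothetical format, not any tool's export.
    """
    cutoff = now - timedelta(days=window_days)
    totals = defaultdict(lambda: [0.0, 0])  # server -> [sum, count]
    for server, ts, value in samples:
        if cutoff <= ts <= now:
            totals[server][0] += value
            totals[server][1] += 1
    return {s: total / count for s, (total, count) in totals.items()}


def flag_servers(averages, threshold):
    """Return the servers whose average exceeds `threshold`, sorted by name."""
    return sorted(s for s, avg in averages.items() if avg > threshold)
```

Usage would be something like `flag_servers(rolling_averages(samples, datetime.now()), threshold=0.75)`. Picking the metric and the threshold is the hard part; the arithmetic is not.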
The second ("which servers required upgrade") is perhaps more configuration management than "monitoring". It depends what you mean by "upgrade". If you mean "has not run apt-get upgrade in six months", that's one thing; upgrading because the server is not performing well, is out of warranty, has a CPU model that is too old, etc. is different.
Lots of people in this thread mentioned external monitoring services. For example, @NodePing is great, but they're on the outside. Other solutions that have an agent are needed if you want things like "is this process down".
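To illustrate why that kind of check needs something running on the box, here is a rough Python sketch of an agent-side "is this process down" test. It assumes a Linux host with `/proc` mounted, and the function names are made up for this example; real agents do considerably more.

```python
import os


def running_process_names(proc_root="/proc"):
    """Collect the command names of running processes (Linux only).

    Reads /proc/<pid>/comm, which the kernel truncates to 15 characters.
    """
    names = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                names.add(fh.read().strip())
        except OSError:
            continue  # process exited between listdir() and open()
    return names


def missing_processes(required, running):
    """Return the required process names that are not running, sorted."""
    return sorted(set(required) - set(running))
```

For example, `missing_processes(["nginx", "mysqld"], running_process_names())` returns whichever of the two is not currently running. An external pinger cannot see this unless the process happens to expose a port.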
Low on resources and infinitely customisable: Xymon
NIXStats and New Relic.
Mainly LibreNMS and an open source PHP script to check UDP every minute. Useful since both have e-mail, SMS, and Pushover alerts.
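For anyone wondering what a UDP check involves, the idea can be sketched in a few lines of Python. This is not the PHP script mentioned above, just an illustration; the function name, payload and timeout are assumptions, and it only works against services that answer the datagram you send.

```python
import socket


def udp_check(host, port, payload=b"ping", timeout=2.0):
    """Return True if any UDP reply arrives within `timeout` seconds.

    UDP is connectionless, so "up" here only means "something answered".
    A service that stays silent on unknown payloads will look down.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, (host, port))
            sock.recvfrom(4096)
            return True
        except OSError:
            # Covers timeouts and ICMP "port unreachable" errors alike.
            return False
```

Run it from cron or a loop once a minute and send an alert when it returns False for a few checks in a row, to avoid paging on a single dropped packet.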
Will bring back SMS messages ASAP; they ran out a bit fast last time (topped up $200 at Twilio and it lasted about 2 months).
Zabbix
I'm using New Relic and LibreNMS (moved from Observium).
I'm guessing the sms notifications aren't free to end users?
https://www.amon.cx/
During beta everything is free.
I'm using Nixstats for a single site.