Monitoring

dominicldominicl Member
edited July 2012 in General

I've noticed some people are starting to set up monitoring networks for all of their VPS's. What do you guys use for monitoring your VPS's and what do you host it on?

Thanks

Comments

  • Anyone? What do you guys use for monitoring?

  • JarJar Member
    edited July 2012

    PHP Server Monitor Plus

    Toying with openstatus but I think I keep messing something up and the uptime field is blank. Right now I host it on one of the vps but I'm going to end up hosting it on my shared hosting server.

  • RobertRobert Member

    At risk of repeating myself from other threads, I use Zabbix. I have it running on a dedicated server. I plan on allowing free accounts once I finish setting up a self-service portal in PHP. If anyone wants a free account, let me know. I'll have to add the hosts manually, but you'll have access to the monitoring dashboard and email alerts.

  • miTgiBmiTgiB Member
    Hostigation High Resource Hosting - SolusVM OpenVZ/KVM VPS
  • JackJack Member
    edited October 2012

    I use openstatus;

    http://status.xjack.pro

    However, people prefer munin.

  • JackJack Member
    edited July 2012

    @miTgiB said: http://www.centreon.com/

    Their site has broken links.

  • works for me?

  • JackJack Member

    @dominicl said: works for me?

    http://www.centreon.com/Content-Download/donwload-centreon

    we recommend that you review Centreon's System Requirements

    Broken link.

  • @Jack said: I use openstatus;

    http://status.xjack.me

    When I click on a '1 hr' or '3hr' link (etc.) all I get is a spinner.

  • TaylorTaylor Member

    @Jack said: Broken link.

    >

    https://forums.dotvps.net/

    Brokenish link.

    I know, I'm Dale Maily.

  • @Jack said: Broken link.

    Ah, yea I did notice that. Sorry

  • JackJack Member

    @sleddog said: When I click on a '1 hr' or '3hr' link (etc.) all I get is a spinner.

    @nickmoeck can fix that, I think it's a bug.

  • JackJack Member

    @Taylor said: https://forums.dotvps.net/

    Brokenish link.

    @dominicl also told me about that. Because it's hurting you so much inside, I shall go and spend the next hour installing Vanilla on a VM just for you.

    I hope you're going to activate if I put one up haha!

    @Taylor

  • TaylorTaylor Member
    edited July 2012

    @Jack said: I hope you're going to activate if I put one up haha!

    >

    I may be, if I get a MB worth of VPS every post :P

    I know, I'm Dale Maily.

  • JackJack Member
    edited July 2012

    @Taylor said: I may be, if I get a MB worth of VPS every post :P

    Hmm..

    Anyway get back on topic!

  • TazTaz Disabled

    Status2k.

    Time is good and also bad. Life is short and that is sad. Dont worry be happy thats my style. No matter what happens i won't lose my smile!

  • sleddogsleddog Member
    edited July 2012

    @dominicl said: What do you guys use for monitoring your VPS's

    Just a thought... "monitoring" can mean different things....

    You can monitor availability, e.g., port 80 on server xyz, and track its status over time (as a % uptime/downtime).

    Or you can monitor performance, e.g., load, memory, swap, and track those variables over time.

    Some apps do one, others do the other, I guess some do both :)

    So when you say "monitor" you need to think about what it is you're concerned about monitoring....
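
    To make the distinction above concrete, here is a minimal Python sketch of both kinds of check. The function names are made up for illustration, none of the tools mentioned in this thread are claimed to work this way, and the load-average sample only works on Unix-like systems.

```python
import os
import socket


def port_is_open(host, port, timeout=5.0):
    """Availability check: can a TCP connection be opened to host:port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, DNS failure: all count as "not available".
        return False


def uptime_percent(results):
    """Share of successful checks in a history of True/False results."""
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)


def load_average():
    """Performance sample: 1-, 5- and 15-minute load averages (Unix only)."""
    return os.getloadavg()
```

    An availability monitor would call something like port_is_open on a schedule and feed the history into uptime_percent; a performance monitor would sample values such as load_average on the server itself and graph them over time.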

  • JackJack Member

    @NinjaHawk said: Status2k

    looks meh.

  • justinbjustinb Member
    edited July 2012

    http://observium.org/wiki/Main_Page for performance

    pingdom on all hosts for uptime/latency

    Postgres

  • miTgiBmiTgiB Member

    @justinb said: pingdom on all hosts for uptime/latency

    Only if you like false results

  • JarJar Member

    @miTgiB Let's be fair. Pingdom is extremely accurate. Until it isn't.

  • @miTgiB said: Only if you like false results

    Haven't gotten a false result yet.. it's a free service anyways

    Postgres

  • miTgiBmiTgiB Member

    @jarland said: Pingdom is extremely accurate. Until it isn't.

    I have zero faith in it. I don't know how many tickets I get with people that depend on pingdom and claiming I was down when there was no issue. Occasionally pingdom gets lucky with a correct down report, but they are rare.

  • KuJoeKuJoe Member
    edited July 2012

    External Monitoring: BinaryCanary (waiting for our new server for a custom monitoring script) and scrd+status+munin (results).

    Internal Monitoring: scrd+status+munin (results), custom monitoring script, and Observium.

    -Joe @ SecureDragon - LEB's Powered by Wyvern in FL, CO, CA, IL, NJ, GA, TX, and AZ
    Test our network here: Drgn.biz
  • camargcamarg Member

    I use cacti along with nagios for some servers.

    @miTgiB said: people

    maybe that's the problem :)

  • Demo of my availability monitor: http://199.96.82.38/pung/

  • JackJack Member

    @sleddog looks nice how do i get it?

  • @Jack said: @sleddog looks nice how do i get it?

    +1, that's some nice work.

  • @GetKVM_Ash said: @Jack said: @sleddog looks nice how do i get it?

    comment Inception.

    Anyways, yeah dude, that looks awesome.... Are you going to open source that? ;)

    Catalyst Host - Pie Approved!
  • sleddogsleddog Member
    edited July 2012

    Yes, OSS, WIR :) I originally wrote it ~6 years ago for internal use. I thought I'd clean it up a bit for public use, but the "clean up" turned into a rewrite.

    Meanwhile, guess who monitors for tcp connections with no data, and eventually blocks for an hour? I'm guessing WHT comes up again around 15:14 NDT :)

  • @miTgiB said: I have zero faith in it. I don't know how many tickets I get with people that depend on pingdom and claiming I was down when there was no issue. Occasionally pingdom gets lucky with a correct down report, but they are rare.

    We are finding this increasingly: customers just assume that when pingdom says it's down, it's actually down, when in reality their site is still online.

    @KuJoe said: BinaryCanary

    We are using BinaryCanary too, finding it much more reliable than Pingdom! Then we have our internal monitoring system, teamed with Cacti and Nagios.

    LoveVPS - 2GB RAM - 25 GB RAID 10 Spring Sale from $7.00/mo - We provide KVM Virtual Servers with love!

  • JarJar Member

    Still not one false report from uptime robot in a year ;)

  • I re-started my demo/test monitor, targeting providers' test IPs from the Offers forum. I'm interested in reducing/eliminating false positives, and I thought this might help -- with confirmation/denial from the provider regarding any downtime :)

    http://199.96.82.38/pung/

  • gsrdgrdghdgsrdgrdghd Member without signature

    @sleddog said: I'm interested in reducing/eliminating false positives

    Maybe you should add distributed monitoring for that

  • @gsrdgrdghd said: Maybe you should add distributed monitoring for that

    No, it would add little or no benefit, and be significantly more complex, which in turn creates a greater probability of errors.

  • I see one of our IPs there, thank you @sleddog

  • gsrdgrdghdgsrdgrdghd Member without signature

    @sleddog said: it would add little or no benefit

    Hows that?

  • sleddogsleddog Member
    edited July 2012

    @gsrdgrdghd said: Hows that?

    People generally look at scripts like this as an "uptime" monitor. It isn't, and I don't :) It's a point-to-point connection monitor. It tries to establish a tcp/ip connection across the Internet from Point A to multiple targets (Points B) on designated ports, and records the results.

    In my script, an attempted connection can have one of three possible results:

    1. Connection succeeded
    2. Connection refused
    3. Connection timed out

    The meaning of "Connection succeeded" is obvious. "Connection refused" means that the target was reached but it refused the connection. This could be because a listening service has stopped (e.g., apache has crashed) or a firewall has rejected the connection.

    "Connection timed out" means that the target offered no response. This is the most problematic. It could be because:

    A. The target -- Point B -- is offline, or:
    B. There is a networking issue somewhere on the route from Point A to Point B.

    Obviously there's a third possibility:

    C. There's a localized issue causing Point A to have lost Internet access.

    But the script checks for that, logs it if it exists and exits quietly.
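
    The three outcomes and the local check described above could be sketched in Python roughly as follows. This is not the actual script: check_target, station_is_online, run_once and the idea of "reference targets" are all assumptions made for illustration, and errors such as "no route to host" are simply lumped in with timeouts.

```python
import socket


def check_target(host, port, timeout=5.0):
    """Try one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "succeeded"
    except ConnectionRefusedError:
        # The target was reached, but nothing accepted the connection.
        return "refused"
    except OSError:
        # Covers socket timeouts plus errors such as "no route to host";
        # this sketch treats all of them as "no response".
        return "timed out"


def station_is_online(reference_targets, check=check_target):
    """Assumed local check: Point A is online if any reference host responds at all."""
    return any(check(host, port) != "timed out" for host, port in reference_targets)


def run_once(targets, reference_targets, check=check_target):
    """Return {(host, port): result}, or None if Point A itself looks offline."""
    if not station_is_online(reference_targets, check):
        # Possibility C: log it locally and exit quietly instead of
        # recording a failure against every target.
        return None
    return {(host, port): check(host, port) for host, port in targets}
```

    The check parameter exists only so the logic can be exercised without touching the network; a real run would use the default.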

    Adding one or more monitoring stations ("distributed" monitoring) might help clarify if the issue is B above, but only if the additional stations take different routes to Point B, and the issue doesn't lie along those alternate routes. But frankly this is something I'd prefer to investigate manually. Frequently repeated or extended timeouts say, "look into it but don't make assumptions" :)

    Again, it's not an "uptime" monitor. It's a point-to-point connection monitor with history. A red dot doesn't necessarily mean "OMG It's Down!" It means there was an issue establishing a connection between two points.

    Of course monitoring won't happen (and may not be available for viewing via www) if the monitoring station (Point A) is down. That's why I put it either on a remote LEB with respectable uptime / network uptime (say at least 99%), or on a local box that I manage.
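
    For anyone who does want to try the distributed idea discussed above, here is a hedged Python sketch of how results for one target from several stations might be combined. The interpret function, the station names and the wording of the summaries are hypothetical. As noted in the post, agreement between stations still proves nothing if they share part of the route to Point B.

```python
def interpret(station_results):
    """Summarise one target's results as seen from several stations.

    station_results maps a station name to 'succeeded', 'refused' or 'timed out'.
    """
    if not station_results:
        return "no data"
    timed_out = sorted(s for s, r in station_results.items() if r == "timed out")
    if not timed_out:
        # Every station at least reached the target (connected or was refused).
        return "target reached from every station"
    if len(timed_out) == len(station_results):
        # Cases A and B cannot be told apart from this alone.
        return "no response anywhere: target offline or shared route problem"
    return "partial: likely a route problem for " + ", ".join(timed_out)
```

    For example, interpret({"nyc": "succeeded", "lon": "timed out"}) points at the route from "lon" rather than at the target itself.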
