Comments
@jcaleb said: @prometeus said: I know that this might be difficult to code but it would be fantastic if you could add the ability to talk between two (or more) instances of your script as part of the check sequence. I.e. the instance in Chicago talks with another instance in London or Los Angeles to have confirmation of a down or severe packet loss...
One way I am thinking is, if it's SQLite based, then just git commit the db file at some interval. Then a master one just pulls the SQLite db from the different locations and consolidates.
I'd argue that this kind of approach introduces more complexity, and more complexity brings more potential for error. I wrote about it in the other monitoring thread: http://www.lowendtalk.com/discussion/comment/90513#Comment_90513
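For what it's worth, a rough sketch of the git-based consolidation described in that quote might look like the script below. Every name in it (pung.db, the repo paths, the checks table) is an assumption for illustration; it is not how pung is built.

```bash
#!/bin/bash
# Illustrative only: assumes each slave keeps its results in an SQLite file
# called pung.db inside a git repo, and that a "checks" table exists.
# pung itself is a bash script writing text files, so none of this is real.
set -eu

# --- on each slave, run from cron at some interval ---
cd /opt/pung
git add pung.db
git commit -m "results $(date -u +%FT%TZ)" || true   # "nothing to commit" is fine
git push origin master

# --- on the master, pull every location and merge into one database ---
for loc in chicago london losangeles; do
  git -C "/srv/pung-$loc" pull --quiet
  sqlite3 /srv/pung-master/combined.db \
    "ATTACH '/srv/pung-$loc/pung.db' AS remote;
     INSERT OR IGNORE INTO checks SELECT * FROM remote.checks;
     DETACH remote;"
done
```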
It's important I think to take the results of the pung monitor quite literally. At the moment, connections to the CVPS Buffalo target are timing out. That doesn't mean CVPS Buffalo is "down". It might be, or there might be a network issue between the Chicago monitor & CVPS Buffalo. A second (or third) monitoring station would answer that only if it took a completely different route to the CVPS Buffalo target (triangulation). But it's impossible to control routing for something like this.
So my preference is to keep the monitoring as it is -- a single, simple point-to-point test -- and then investigate manually when an issue crops up.
I understand, but a quorum between monitoring nodes would be a valuable plus
Can you explain the benefit it would add?
How about a main website that pulls the info from different pung slaves, then displays it in different tabs per location? I.e. it doesn't consolidate per ping; it just shows one tab for the node in Canada, one tab for the node in LA, etc.
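A minimal sketch of that pull-and-display idea, assuming each slave simply exposes its status text file over HTTP; the URLs and paths below are invented, and the PHP page would just render one tab per fetched file.

```bash
#!/bin/bash
# Hypothetical fetcher for a master status page: grab each slave's plain-text
# status file and store it per location for the web page to read.
set -u

declare -A slaves=(
  [canada]="http://monitor-ca.example.com/pung/status.txt"
  [la]="http://monitor-la.example.com/pung/status.txt"
)

for loc in "${!slaves[@]}"; do
  # -f: treat HTTP errors as failures, -s: silent, --max-time: don't hang on a dead slave
  if curl -fs --max-time 10 "${slaves[$loc]}" -o "/var/www/pung/status-$loc.txt"; then
    echo "fetched $loc"
  else
    echo "could not reach $loc slave" >&2
  fi
done
```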
To limit false positives :-)
Say a cable like SEA-ME-WE4 breaks; this cable works in sections. We know Bangladesh only has access to this cable (they have backup satellite links). However, if you're in Bangladesh your impression is that the internet is failing you, whereas if you have a slave in (say) India you know that's not true. (Terrible example.)
When you're talking about IP traffic it's vital you have Src/Dst and a 3rd monitor to verify the Src/Dst path isn't broken.
People who sell this type of information
To me a false positive means that a test is reported as "OK" when it's not. The only way I can see that happening is by a bug in the pung app.
Maybe you mean false negative -- a test is reported as "timed out" when it's actually OK. To me, timed out means timed out -- remember the test is actually two attempts over a ~15 sec timespan (with a successful connection to google or twitter or facebook in between). So something went wrong. Still, I tend to discount single, isolated failures as insignificant / unimportant.
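A reader's sketch of that two-attempt pattern (not sleddog's actual script): probe the target, confirm the monitor's own connectivity against a well-known site, then retry before calling it a timeout. The timeouts, the port argument, and the choice of control host are all assumptions.

```bash
#!/bin/bash
# Sketch of a point-to-point TCP check with a control connection between two
# attempts -- an approximation of what is described above, not pung itself.
host="$1" port="$2"

probe() {
  # bash's /dev/tcp pseudo-device opens a TCP connection; 'timeout' bounds it
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

if probe "$host" "$port"; then
  echo "OK"
  exit 0
fi

# first attempt failed -- make sure the monitor itself can reach the internet
if ! probe www.google.com 80; then
  echo "monitor connectivity suspect, not counting this run" >&2
  exit 2
fi

sleep 10                                  # roughly the ~15 s window mentioned above
if probe "$host" "$port"; then
  echo "OK (second attempt)"
else
  echo "timed out"
fi
```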
With IPXcore reporting up and being in the same racks, at most the machine you are testing for in Buffalo could have an issue, but ChicagoVPS Buffalo is definitely up.
By "CVPS Buffalo" I meant the CVPS Buffalo target (IP:port), not all services provided by CVPS in Buffalo.
Yes, I'm looking at this from the alert point of view. I would like a "Houston, we've got a problem" message only when it's double-checked/confirmed. But I don't want to insist; your point about keeping it simple is valid :P
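Purely to illustrate the double-check being asked for here (pung has no such feature): ask a second monitoring node for its verdict on the same target and only send the alert when both agree. The peer URL, its OK/FAIL output, and the mail command are assumptions.

```bash
#!/bin/bash
# Sketch: alert only when a second, independent monitor confirms the failure.
# Assumes a peer node publishes its last verdict ("OK"/"FAIL") at an invented URL.
target="$1"

local_check() { timeout 5 bash -c "exec 3<>/dev/tcp/$1/80" 2>/dev/null; }

if local_check "$target"; then
  exit 0                                   # fine locally, nothing to do
fi

peer_verdict=$(curl -fs --max-time 10 \
  "http://monitor-london.example.com/pung/verdict.php?host=$target")

if [ "$peer_verdict" = "FAIL" ]; then
  echo "Houston, we've got a problem: $target looks down from two locations" \
    | mail -s "pung alert: $target" admin@example.com
else
  echo "only one monitor sees a failure for $target -- not alerting" >&2
fi
```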
There just went my uptime. Still strong in Chicago though.
This is when you know pingdom is not always correct. It spotted a 5-minute downtime (monitored every minute) last night, yet pung reported no problems.
What's the check interval, @sleddog?
5 minutes. So it's just possible it was missed
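For context, a 5-minute cycle like the (hypothetical) crontab entry below samples each target only twelve times an hour, so an outage that starts and ends between two runs never shows up in the log:

```bash
# Hypothetical crontab entry for a 5-minute check cycle; anything shorter than
# the gap between two runs can fall entirely between samples.
*/5 * * * * /opt/pung/check.sh >> /var/log/pung-cron.log 2>&1
```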
Oh sorry, I just looked properly. It was only 4 seconds, so pung probably missed it.
How can pingdom detect a 4-second outage when it checks every minute...?
LOL. I'm ashamed. I read the graphs wrong again, I didn't get much sleep last night :P
Pingdom kept you up?
It disturbed me yeah, then just as I was about to get up I got the second SMS: "hypervisor01 is back online". Doh.
Sleep's never the same when you get disturbed
So true...
BTW, on the pung page at the bottom you can now set your local time offset (from UTC). The "Last run" time at the top will then be within 5 minutes of your local time, and the log will be in your local time.
I changed the log time format to a Unix timestamp for easier manipulation; that's why there's a new log.
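For anyone scripting against the new log, a Unix timestamp plus an hour offset is easy to render locally. The "<epoch> <message>" layout and the pung.log filename below are guesses, not pung's documented format:

```bash
#!/bin/bash
# Render a unix-timestamp log in local time by adding an hour offset to UTC.
offset_hours=${1:-2}            # your offset from UTC, e.g. 2 for UTC+2

while read -r epoch rest; do
  local_epoch=$(( epoch + offset_hours * 3600 ))
  # -u stops 'date' from applying the system timezone on top of our offset
  printf '%s  %s\n' "$(date -u -d "@$local_epoch" +'%Y-%m-%d %H:%M')" "$rest"
done < pung.log
```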
Cool
It surprises me how fast the page loads with so many checks. Our status page can take forever sometimes, but that's probably because we have a PHP check for each service on the same page :P
@GetKVM_Ash said: It surprises me how fast the page loads with so many checks
Checks are done independently with a bash script. The PHP page merely reads a couple of text files and formats the display. If you happen to load the PHP page while the bash script is in progress you'll see 'Running...' at the top in place of 'Last Run'.
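A guess at how that split could look on the bash side (paths, file names and the status format are invented; the real script hasn't been posted): write results to a temp file, flag the run in progress, and swap files atomically so the PHP page only ever sees "Running..." or a complete result set.

```bash
#!/bin/bash
# Sketch of a checker that leaves plain text files for a PHP page to read.
set -u
statedir=/var/www/pung

touch "$statedir/running"                    # PHP shows "Running..." while this exists
tmp=$(mktemp "$statedir/status.XXXXXX")

while read -r name host port; do             # targets.txt: "name host port" per line
  if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "$name OK"   >> "$tmp"
  else
    echo "$name FAIL" >> "$tmp"
  fi
done < "$statedir/targets.txt"

date +%s > "$statedir/lastrun.txt"           # unix timestamp of this run
mv "$tmp" "$statedir/status.txt"             # same-filesystem rename: page never sees a half-written file
rm -f "$statedir/running"
```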
@sleddog,
Messaged you about having two of our Xen VPS nodes added to the list (in France and Germany).
Thanks exextatic. The list is made up of hosts that have made offers here (or on LEB), and I'd like to keep that informal rule... maybe you'll consider making an offer?
Scheduled maintenance ruined our perfect record
@sleddog,
Waiting on Chief to post it on LEB, and for me to have a full 7 days registered here before I post it here
Thanks for sharing and I'm sure a lot of people will find it useful.
Deja vu... Got a different IP in Buffalo that responds on port 80?
Woop, looks like everybody got a reset. Let the uptime rivalry begin.
It's August now
Ah, reset every month, I'm with you.