Did BuyVM just go down?

Comments

  • marcm Member

    @qps maybe you should let them know that their web site is down due to PHP not working.

  • @Zeno said:
    node: lv-storage04 and lv-kvm11 still down for now, but http://buyvmstatus.com/ told me it's up? ( check node:storage-lv-04 )

    Probably the internal network is up but the external one is not ready yet?

  • tommy Member

    fiberhub.com No route to host :P

  • Still not up. I can access the server management area, but I cannot boot my machine.

  • Francisco Top Host, Host Rep, Veteran

    The only nodes down at this point are a half dozen KVM's. 90%+ of our OVZ's were up within 45 minutes (the outage was 20 minutes or so) and a few nodes required a fsck afterwards.

    Storage nodes required a little extra work since mdadm didn't save the RAID50. For those that don't remember, we do two hardware RAID5s and then stripe them together (RAID0) in the OS (rough sketch at the end of this post).

    I'm working with FH right now to see what's up with the last of the KVM's.

    Here is the 'basic' RFO that Natalie gave me:

    Hello Francisco,

    Thank you for your patience, we will have a tech work on your equipment shortly.
    Our facility experienced a partial utility power failure this afternoon.
    Because the failure was only partial and not a complete loss of utility power,
    our ATS system failed to automatically switch us over from battery to generator
    power. We subsequently were able to locate the issue and transferred power to
    our backup generator manually, but unfortunately our UPS system had run out of
    capacity before we could complete manual transfer. Our electrical contractors
    are on site now and are working to determine when it will be safe for us to
    return to utility power. This has also caused us to have some routing issues
    which we have our network engineer working to resolve currently.

    We appreciate your patience, and are working diligently to restore 100% service.
    An RFO will be released as soon as available.

    Thank you,

    Francisco
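
    For anyone curious, here is a rough sketch of what re-assembling that kind of RAID50 can look like. The device names, md device, and helper are made up for illustration; this is not BuyVM's actual layout or tooling, just a generic mdadm recovery flow driven from Python:

        # Hypothetical sketch: two hardware RAID5 LUNs striped together as a
        # software RAID0 with mdadm. All device names are placeholders.
        import subprocess

        HW_RAID5_LUNS = ["/dev/sda", "/dev/sdb"]  # assumed HW RAID5 volumes
        MD_DEVICE = "/dev/md0"                    # assumed software RAID0 device

        def reassemble_raid0(md_device, members):
            """Try to re-assemble the stripe; fall back to scanning superblocks."""
            try:
                subprocess.run(["mdadm", "--assemble", md_device] + members, check=True)
            except subprocess.CalledProcessError:
                # If the array wasn't shut down cleanly (as after a power loss),
                # a superblock scan may still find and assemble it.
                subprocess.run(["mdadm", "--assemble", "--scan"], check=False)

        if __name__ == "__main__":
            reassemble_raid0(MD_DEVICE, HW_RAID5_LUNS)
            subprocess.run(["cat", "/proc/mdstat"])  # verify the stripe came back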

  • jbiloh Administrator, Veteran
    edited June 2013

    BuyVM is still down? Jeez I feel bad for them and everyone else impacted.

  • @jbiloh said:
    BuyVM is still down?

    Did you read Fran's post?

  • jbiloh Administrator, Veteran

    I hadn't refreshed :-p

  • Francisco Top Host, Host Rep, Veteran

    Nodes were up quick, as I said (< 45 minutes for the OVZ's). Storage took me a bit longer since I had to reassemble the arrays.

    Right now the only thing pending is a few KVM nodes.

    The "positive" I guess is that 2.6.32 got rolled out on all nodes. We had been planning an OVZ wide upgrade in the coming weeks but guess we won't need that now :P

    Enjoy the vswap,

    Francisco
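
    For anyone wondering what the vswap kernels change in practice: on the 2.6.32 OpenVZ branch, container memory is set with --ram/--swap instead of the old pile of user_beancounter limits. The container ID and sizes below are invented for illustration, not an actual BuyVM plan:

        # Minimal sketch of vswap-style limits on an OpenVZ node (hypothetical CTID).
        import subprocess

        def set_vswap_limits(ctid, ram="512M", swap="512M"):
            """Apply RAM/vswap limits to a container and persist them."""
            subprocess.run(
                ["vzctl", "set", str(ctid), "--ram", ram, "--swap", swap, "--save"],
                check=True,
            )

        if __name__ == "__main__":
            set_vswap_limits(101)  # container 101 is a made-up example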

  • jcaleb Member

    Boss, is .32 stable already based on your testing?

  • Paul Member

    So June is probably a downtime month. Lots of servers going down from hacks, electrical failures, etc. Whewww...

  • Zeno Member

    @Francisco said:
    The "positive" I guess is that 2.6.32 got rolled out on all nodes. We had been planning an OVZ wide upgrade in the coming weeks but guess we won't need that now :P

    Enjoy the vswap,

    Francisco

    WOW! vswap, be quick, I need it!

  • jon617 Veteran

    @Francisco said:
    The only nodes down at this point are a half dozen KVM's

    Looks like mine is one of those half a dozen KVMs. Down for 6 hours now.

  • Zeno Member

    @jon617 said:
    Looks like mine is one of those half a dozen KVMs. Down for 6 hours now.

    Me too, I have two KVMs on lv-storage04 and lv-kvm11

  • Woop Member

    Has anyone's node gone live since the outage? Mine's been down all day :(

  • vpsboard is offline too. Maybe time to stop CC rants. If I compare the uptimes of Buffalo and Las Vegas, Fiberhub does not shine.

  • @wlanboy said:
    vpsboard is offline too. Maybe time to stop CC rants. If I compare the uptimes of Buffalo and Las Vegas, Fiberhub does not shine.

    Fiberhub does indeed seem worse than the SJ location they had before. Routing issues are one thing, but this definitely isn't the first power outage at Fiberhub...

  • but this definitely isn't the first power outage at Fiberhub...

    3rd in 4 months
    https://my.frantech.ca/announcements.php?id=140
    https://my.frantech.ca/announcements.php?id=141

  • One would think they'd set it up so that in the event that the UPS power levels begin draining, the backup generators would be kicked on.

  • Francisco Top Host, Host Rep, Veteran

    All but 6 KVM's were back up; only those were down for this extended time.

    You're right, FH has been getting on our nerves with the power stuff. We figured Feb was the end of it and have been quite happy since. Today's episode was resolved quickly for most people, but still, it shouldn't have happened.

    They've been working slowly on getting B feeds in place, like we had been planning since Feb. I'm fairly sure that if the B feeds were in we would have felt a short network blip and gotten past it. Power was the #1 thing we brought up when we were checking them out, and they assured us all would be well. Prior to these last few months, where they've been growing their setup, things were quite solid.

    I don't doubt they'll address it all, I just wish it wouldn't keep kicking us in the groin :P

    Francisco

  • letbox Member, Patron Provider

    @Francisco hope everything is okay with you, and best of luck.

  • Such a shame. Seems like someone will have to move out rather soon if there's more power trouble...

  • Francisco Top Host, Host Rep, Veteran

    @Conn8ct said:
    Such a shame. Seems like someone will have to move out rather soon if there's more power trouble...

    No, we're simply getting A+B feeds in place like we were planning this whole time. It's not an easy task to set up, so it has taken a lot of time for them to tie it up. Rob has some other fancy stuff on the way; he said he'll document it all in his RFO, due by the weekend I hope. :)

    All is well otherwise. Stallion 2 is pending a single import script, and node upgrades are coming in August. 2.6.32 finally rolled out to all of LV (granted, not in the way we originally planned). When we go in August we'll install the ATS units and get all of our power things addressed in one go.

    The setup going in for the B side is likely the same size as A, but the number of clients on the B side is going to be quite low since most people won't want to fork out for the cost of A+B feeds.

    It's frustrating, but it's also hard to find reasonably priced DCs that will resolve the issues they have. Portland was a level of fucked I won't bother describing. FMT had lots of power issues and HE never documented to anyone what they changed. Did they fix FMT1? Who knows. We had tons of long network outages because HE had all of China DDoSing them. The DDoS was a monthly thing.

    Coresite twice had us in a position of no generators for weeks, but thankfully they weren't needed. Top that with the constant problems we had with the network, which lasted till the day we left. 8 months of watching EGI chase their own tail.

    FH has a very stable network, with the only issue being that our own router is finally topped out (we hit almost 2x higher peak usage in FH than we did in SJ on the same hardware). As I said, I have no doubt Rob & Don will plan out a really solid setup. Top that with staff that actually care about what's offered. Every time we need anything done (reboots, builds, network tweaks, etc.) they handle it quickly. I think they've just had a large surge of sales (I don't think they have any more cages available) and have had to scale what they have.

    With the feature set Stallion 2 will bring, A+B feeds are going to be a requirement. A power outage with load balancers, anycast, etc. all involved could turn into the most brutal mess to clean up.

    Francisco

  • Zeno Member

    @Francisco Node:lv-kvm11 still down for now? I can't get my VPS online

  • Zeno Member
    edited June 2013

    @Jack I did it several times:

    27/06/2013 03:06 AM Boot x.x.x.x Complete

    27/06/2013 03:06 AM Hard Power Off x.x.x.x Complete

    27/06/2013 12:10 AM Boot x.x.x.x Complete

    27/06/2013 12:10 AM Hard Power Off x.x.x.x Complete

    26/06/2013 11:16 PM Boot x.x.x.x Complete

    26/06/2013 11:16 PM Hard Power Off x.x.x.x Complete

    26/06/2013 06:13 PM Boot x.x.x.x Complete

  • Francisco Top Host, Host Rep, Veteran

    Log a ticket and we'll check it out :)

    Francisco

  • Daniel15 Veteran
    edited June 2013

    The "positive" I guess is that 2.6.32 got rolled out on all nodes.

    My VPS is showing "3.2.0-042stab076.8"... Is that actually 2.6.32 and not 3.2.0?
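
    One way to sanity-check this from inside a VPS (a generic check, not BuyVM-specific): under OpenVZ the container reports the node's kernel, and vswap-capable kernels expose a swappages beancounter; on a KVM VPS you just see your own kernel. The snippet below is a rough sketch under those assumptions:

        # Quick check of the running kernel and (on OpenVZ) whether vswap is present.
        import os
        import platform

        print("kernel:", platform.release())  # e.g. 2.6.32-042stab076.8 on an OVZ node

        ubc_path = "/proc/user_beancounters"  # present on OpenVZ nodes/containers (root only)
        if os.path.exists(ubc_path):
            with open(ubc_path) as f:
                counters = f.read()
            print("vswap kernel" if "swappages" in counters else "no swappages counter")
        else:
            print("not an OpenVZ environment (e.g. KVM)")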

  • jon617 Veteran

    My KVM VPS is back online.

  • pcan Member
    edited June 2013

    @Magiobiwan said: One would think they'd set it up so that in the event that the UPS power levels begin draining, the backup generators would be kicked on.

    A high power circuit does not work this way. You can't blindly transfer lots of power from one system to another: if something goes wrong, bad things will happen. ATS systems have safety interlocks to enable transfer, and they can have issues: if something does not match, the sequence is aborted and manual intervention is needed. A good ATS is anything but trivial, both during design and at field installation, and therefore is expensive and need to be constantly mantained. Any "shortcut" to save some money will lead to missing response on some fault paths (such as not recognizing a single phase fault on a three phase system, or having a fuse "in the wrong place"). When I worked for a generator control systems manufacturer, the main failures complained by end customers where due to maintenance neglect: the Diesel motor starting too late or not starting at all because it was not exercised according to manufacturer specification, and power switch failures due to overload (you can overload the ATS transfer and the load will keep going, but the electrical contacts will "fuse" togheter and will not operate when needed). This is not the case on generators rated for life support, because there is a infrastructure in place to force checks and audits; but on commercial applications usually the cheapest route is followed, often behind the end-user back.
