[RESOLVED] Lunanode Toronto down

sb56637 Member
edited July 2020 in Outages

Currently an outage is affecting Lunanode (lunanode.com website and VMs hosted in Toronto).


Update: https://status.lunanode.com/

Update 10 (06 July 2020 14:20 EDT): all three hypervisors and volume storage system remain offline. We continue to work to resolve the issues but do not currently have ETA on resolution.

Update 9 (06 July 2020 14:00 EDT): three hypervisors remain offline due to hardware failure after power surge: ceac64db9351, cbec3404b692, and b1fda546d1e3. Volume storage system remains impacted and may show timeout operations on some volumes. If your service is not on one of those hypervisors and does not use volumes, but remains offline, please open support ticket.

Update 8 (06 July 2020 13:50 EDT): if you see your VM is online in VNC but not reachable, please try restart, and if it does not work please open ticket.

Update 7 (06 July 2020 13:16 EDT): we have resolved the issue on one of the three failed storage nodes and it is coming online now. We are working on the other two nodes but the first one should be sufficient to bring entire storage system online.

Update 6 (06 July 2020 12:48 EDT): most services are back online but three storage nodes are offline which means distributed storage system is offline so VMs with volume cannot be booted until we resolve this issue.

Update 5 (06 July 2020 11:50 EDT): controller node is booted.

Update 4 (06 July 2020 11:40 EDT): no replacement needed after removing serial port connection. ETA 10 minutes until most services except three hypervisors are online.

Update 3 (06 July 2020 11:35 EDT): some servers are damaged due to power surge. We need to replace it with backup server.

Update 2 (06 July 2020 11:05 EDT): PDU fail due to datacenter power surge. Replace and now services coming back online soon.

Update 1 (06 July 2020 10:20 EDT): appears to be power issue. Our technicians arrive on-site shortly.

Many services in Toronto are down. We are investigating.

Thanked by 2: bdl, sayem314

Comments

  • imok Member

    Waiting for Toronto to come back :)

  • Ouch, they got hit pretty hard by that power surge. Thanks to @perennate for the hard work and immediate response.

    Thanked by 1: imok
  • imok Member

    VMs are online but outside network is disconnected.

  • @imok said: VMs are online but outside network is disconnected.

    So no data loss, apparently. That's the good part.

    I wonder if their routers got damaged too? Or some other piece of networking equipment at the datacenter outside of Lunanode's control?

  • sb56637 Member
    edited July 2020

    Update 6 (06 July 2020 12:48 EDT): most services are back online but three storage nodes are offline which means distributed storage system is offline so VMs with volume cannot be booted until we resolve this issue.

    @perennate It appears that a router or something is still offline, because my VM is up but still not accessible except via the VNC console. lunanode.com is also still inaccessible.
    Thanks for your fast response.

  • sb56637 Member
    edited July 2020

    @perennate Something is definitely wrong with the connectivity, my VM is sporadically on and offline now.

    Thanked by 1: umi
  • @sb56637 said:
    @perennate Something is definitely wrong with the connectivity, my VM is sporadically on and offline now.

    Replying to this thread constantly isn't going to help. Take note of 'most services are back online', which straight away indicates there are still some problems, which is to be expected. Give them a chance to resolve the issue properly and keep an eye on the status updates; that's all you can do.

  • umi Member

    This thread is helpful, as people can hold off on switching back to Toronto and keep operating from their backup nodes for now. My Toronto nodes are still yellow and not connected to the internet.

    Thanked by 1: imok
  • umi Member

    @sb56637 said:

    Update 6 (06 July 2020 12:48 EDT): most services are back online but three storage nodes are offline which means distributed storage system is offline so VMs with volume cannot be booted until we resolve this issue.

    @perennate It appears that a router or something is still offline, because my VM is up but still not accessible except via the VNC console. lunanode.com is also still inaccessible.
    Thanks for your fast response.

    https://dynamic.lunanode.com/panel/ is working

  • For a while my VM was up but not accessible via IPv4, although IPv6 was working. Now both IPv4 and IPv6 are working correctly, and lunanode.com is back up.

    The panel never went down, so it must be on redundant infrastructure. Their ticketing system and some other functions inside the panel were broken though. Currently snapshots and volumes are still not working, as explained.

    Thanked by 1: skorous
  • umi Member

    Not so lucky with b1fda546d1e3.

  • @umi said:
    Not so lucky with b1fda546d1e3.

    Bummer!!! I hope your data is still OK. Would a power surge normally fry an SSD?

  • umi Member
    edited July 2020

    Websites are working fine from a backup node on another provider, except for some very fresh data. Power surges can do many nasty things to the motherboard and other components.

  • umi Member

    All is back! Checking that everything works before switching websites back to Toronto. Thanks!

  • imok Member

    What a day for them

  • @imok said:
    What a day for them

    No kidding. They handled it super well though, kudos to Lunanode.

  • vyas11 Member

    Email updates and a credit offered!
    Brilliantly done.
    Thanks @perennate

    In contrast:
    Yesterday another provider had a hardware upgrade; the announced 1-hour downtime stretched to 4-plus hours. When I raised a ticket, I was asked to check the status page (which only showed red/amber/green colours).

    Know now which provider to renew with!

  • Kudos @perennate for the way that was handled! :-)

    @vyas11 said:
    Know now which provider to renew with!

    100%

  • Damnit surge protector, you had one job to do!

    Thanked by 3: dahartigan, bdl, sb56637
  • It’s not fixed yet. I’m getting downtime alerts multiple times every hour for the whole day ☹️

  • perennate Member, Host Rep

    @sayem314 the outage yesterday was resolved and only affected Toronto. If you have an ongoing issue then please open a support ticket. We are not seeing packet loss in Toronto, Montreal, or Roubaix at this time.

  • It was actually an issue with my application (a memory leak causing the app to crash and restart again after a few minutes) 😪 Opened a ticket and Lunanode was very fast and helpful. Appreciate the support 🙏

    Thanked by 1: lokuzard