IWstack outage - Page 2

Comments

  • netman Member

    @Maounique said:
    Sorry for that, at this time HA is turned off until we manage to make sure something like this will not happen again.

    Mine were down too, until I received the mail about the incident and restarted it.

    Unfortunately the mail wasn't sent until 4 hours after the situation had been resolved, so that meant another 4 hours of downtime for my instance.

    I'm a new customer, and this is a new server, so I hadn't set up any monitoring yet. Guess I need to get going on that...

  • kaflo Member
    edited July 2014

    @netman said:
    the mail wasn't sent until 4 hours after the situation had been resolved, so that meant another 4 hours of downtime for my instance

    8 hours of downtime?

    so much for high availability

    you may want to come down from the clouds now and back to the "ordinary" VPSes.

  • Maounique Host Rep, Veteran

    netman said: Unfortunately the mail wasn't sent until 4 hours after the situation had been resolved,

    Not really, it was sent when HA was turned off, not 4 hours after that.

  • netman Member

    @Maounique said:
    Not really, it was sent when HA was turned off, not 4 hours after that.

    Okay. Double-checking the mail headers, I can see now that I was confused by the many timezones, daylight saving time etc. involved (including a forward from the address you sent it to), and that all hops stay within the same minute and second (more or less).

    So you are probably right, and I'm wrong. Sorry about that.
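
The header check described above can be sketched in Python. This is only an illustration, assuming the message is available as raw text: it reads each `Received` header, takes the date after the last `;`, and converts it to UTC so hops stamped in different timezones can be compared directly. The `SAMPLE` message, its hostnames, and its timestamps are made up for the example and are not the real headers from this incident.

```python
from datetime import datetime, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical message: two hops stamped in different UTC offsets.
SAMPLE = (
    "Received: from a.example by b.example; Sat, 5 Jul 2014 04:05:10 +0200\n"
    "Received: from c.example by a.example; Fri, 4 Jul 2014 22:05:09 -0400\n"
    "Subject: incident notice\n"
    "\n"
    "body\n"
)

def hop_times_utc(raw_message: str) -> list[datetime]:
    """Return the timestamp of every Received hop, normalized to UTC."""
    msg = message_from_string(raw_message)
    times = []
    for received in msg.get_all("Received", []):
        # The date-time of a Received header follows its last ';'.
        _, _, stamp = received.rpartition(";")
        hop = parsedate_to_datetime(stamp.strip())
        times.append(hop.astimezone(timezone.utc))
    return times

if __name__ == "__main__":
    hops = hop_times_utc(SAMPLE)
    for hop in hops:
        print(hop.isoformat())
    # Spread between the earliest and latest hop, in seconds.
    print((max(hops) - min(hops)).total_seconds())
```

Once everything is in UTC, a spread of a second or two between hops means the mail was not held up in transit, whatever the local offsets in the individual headers suggest.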

    Thanked by 1Maounique
  • Crab Member

    @Maounique Are you still facing problems? My instance has been horribly slow today.

  • Bah it was just @Maounique trying to do a runner but he forgot how again.

  • Crab Member

    Could have been something on my side too. The load went to 5+ for no apparent reason, SSH connections were getting dropped, etc. Rebooted the instance and everything has been smooth so far.

  • It's probably hundreds of VMs being booted up and set up after the downtime.

  • Maounique Host Rep, Veteran
    edited July 2014

    0xdragon said: It's probably hundreds of VMs being booted up and set up after the downtime.

    Not really. They either booted before the announcement or were left down after we turned off HA. It must have been a local issue; at this time there are no known problems in the whole infrastructure apart from the disabled HA. Actually, load is lower than usual, which means either some VMs are still down (it can't be more than 20-30) or the restarts cleared internal issues with some instances.

  • Maounique Host Rep, Veteran
    edited July 2014

    Reason for stopping: Migrating to the same host.

    2014-07-05 02:04:32,880 DEBUG [cloud.capacity.CapacityManagerImpl] (AgentConnectTaskPool-199:null) VM state transitted from :Running to Running with event: AgentReportRunningvm's original host id: 45 new host id: 45 host id before state transition: 45
    

    This is a bug we are investigating.
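
For anyone wanting to look for the same pattern in their own logs, here is a rough Python sketch. It assumes the log lines have the same shape as the one quoted above and simply flags state transitions where the original, new, and previous host ids are all identical. The regular expression and the function name `is_same_host_transition` are illustrative assumptions, not part of CloudStack, and a match only says the line looks like the one above, not that the bug was triggered.

```python
import re

# Matches the host-id tail of the DEBUG line quoted above (assumed format).
HOST_IDS = re.compile(
    r"original host id: (?P<original>\d+) "
    r"new host id: (?P<new>\d+) "
    r"host id before state transition: (?P<before>\d+)"
)

def is_same_host_transition(line: str) -> bool:
    """True if the line reports a transition where all three host ids match."""
    match = HOST_IDS.search(line)
    return bool(match) and match["original"] == match["new"] == match["before"]

if __name__ == "__main__":
    line = (
        "2014-07-05 02:04:32,880 DEBUG [cloud.capacity.CapacityManagerImpl] "
        "(AgentConnectTaskPool-199:null) VM state transitted from :Running to "
        "Running with event: AgentReportRunningvm's original host id: 45 "
        "new host id: 45 host id before state transition: 45"
    )
    print(is_same_host_transition(line))
```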

  • mikho Member, Host Rep

    @MarkTurner said:
    Raymii - make sure the machines are IDENTICAL, same mobo, same CPU version, same RAM type, etc. Drove me crazy about 18 months ago when we built this thing.

    That is one of the reasons to not build large as hell clusters.
    Very expensive to upgrade.
    We keep our clusters sized at 10-20 hosts.
