Delimiter Atom Down

My Atom has been down for ~6 hours. I opened a ticket and was told to check the Network Status page, which hasn't been updated since 11:18am EST and simply says "investigating." I'm wondering how widespread this issue is. Anyone else have one down?

Comments

  • SSDBlaze Member, Host Rep

    6 hours isn't horrible.

    I'm sure they will have it up within 12 hours. Usually providers don't leave their services down longer than that.

  • MikePT Moderator, Patron Provider, Veteran

    @MarkTurner will chime in! :)

  • It came up literally after I clicked post! The power of LowEndTalk I guess.

  • SSDBlaze Member, Host Rep

    @klpowell said:
    It came up literally after I clicked post! The power of LowEndTalk I guess.

    haha.

    Yep, LET has supernatural powers.

  • Nomad Member

    What about the power of impatience?

    6 hours is way too quick, especially when they told you there's a network-related issue.

    Thanked by 1netomx
  • SSDBlaze Member, Host Rep

    @Nomad said:
    What about the power of impatience?

    6 hours is way too quick, especially when they told you there's a network-related issue.

    6 hours can feel really long if you planned your day around doing work on it.

    You have a point though, I'd say 12 hours is the time to start posting.

    Thanked by 1Scottsman
  • They didn't tell me anything. They said "investigating." I'm OK with downtime as long as there is communication. Not updating the Network Status page and answering a ticket with "read the Network Status page" is not communication. Six hours is the most downtime I have had with any of the providers I use for quite some time.

  • And btw, the Network Status page still hasn't been updated:

    Atla03 - Atlanta - Some Atom Servers offline (Reported)

    Affecting Server - Yomura Master Provisioning | Priority - Critical

    We're investigating loss of connectivity for some Atom servers in our Atlanta datacentre. 
    

    Date - 04/24/2015 11:17

    Last Updated - 04/24/2015 11:18

  • Nomad Member

    So... It's still under investigation right?

    "I've no further questions your honor"

  • Been up for 30 min so who knows...

    Listen, I have been extremely happy with Delimiter. This little Atom has far exceeded my expectations. But my point here is: why is communication so difficult for companies in the low-end market? In my professional career working in an IT dept for a Fortune 500 corporation this was always one of my top priorities: take a moment and update your clients on the status of their issue. I've had clients that asked for an update every 15 minutes. Was that a pain? Yes. Did that lead to a slowdown in restoration? Yes. But was the client happy? Yes. In my own hosting career I have always taken the time to update clients when there is an issue. I don't care if they paid me $1 or $1,000 for the service, it isn't too hard to update. A "Still working on it" gives customers much greater reassurance that their problem is being taken care of. Over 6 hours, I would hope to have at least 2 updates on a network outage. This is not anything against Delimiter, just an industry issue. I've had much higher-dollar hosts than this that have failed at communication as well.

  • It was not Delimiter's fault but our fault. They are normally very good at disseminating information via their network issues page and their three push-to-handset services.

    We have only given them limited updates today as we have been waiting for Force10's engineering to help handle a replacement card. Unfortunately the spare card that was used to replace the failed one had the wrong firmware. Force10 is especially pernickety when it comes to individual card firmware upgrades.

    It's often easier to leave Force10 to handle these types of jobs, otherwise we'd have another 100+ ports down, which would be another calamity.

    Anyway, the card was finally upgraded in an unused chassis and then installed back into the aggregate switch used for the rack with those Atoms in it.

  • Thanks @MarkTurner, sounds like a fun day! Seems to be working fine on my end now.

  • Bruce Member
    edited April 2015

    one of my atoms is still down. might be some residual issues in the DC

    update: all good now

  • Mine had an odd reboot after it came up, but has been stable since then.

  • lazyt Member

    Guess mine was in that rack as well. It kind of shocked me when the monitoring showed it was down.

  • Rebooted 7:55 hours ago, but fine now.

  • Bruce Member
    edited April 2015

    anyone else have problems with their atoms not booting automatically?

    this is a list of outages reported on my atom nodes @ Delimiter (past 12 months).

    2015-04-24 17:39:36  
    2015-04-18 12:36:50  
    2014-10-29 12:52:29  
    2014-10-26 00:46:44  
    2014-10-19 11:10:51 *
    2014-09-27 11:23:27 *
    2014-09-17 21:00:02  
    2014-09-16 13:21:12  
    2014-09-03 22:13:27  
    2014-08-28 19:58:41 *
    2014-07-18 12:58:35 *
    2014-07-14 21:38:25  
    
     * not all nodes affected
    

    some of these were nodequery issues. some were brief transit issues. but sometimes all my atoms went down. when that happened they needed to be power cycled. different to my blades in the same DC, which I've never had to power cycle.

    anyone else have to power cycle their atoms after an outage? if we all have similar experience, then maybe it's something Delimiter can tweak (or perhaps have just resolved with a new card)

    Note: not a complaint. very happy with these $5/m servers. just interested if others have same "feature"

  • I was going to renew my Atom server but then it was down not once, not twice, not even three times, but four times this week. Besides that though, I didn't mind the service and their customer support wasn't bad either. The other thing is that a dedicated server was too much for what I was running, so I downsized to a VPS instead.

    @Bruce said:

    anyone else have to power cycle their atoms after an outage? if we all have similar experience, then maybe it's something Delimiter can tweak (or perhaps have just resolved with a new card)

    I believe Delimiter uses a SAN for their Atom servers and because of that, whenever a drive in the RAID array fails and is replaced, the filesystem becomes read-only until the OS is rebooted. So I wouldn't call it an "outage". There are also pros and cons to having everyone's servers power cycled automatically.

  • lazyt Member

    I've had one time where I had to power cycle my Atom there. Other than that it has worked surprisingly well.

    For the price I would rather use it than a VPS. It's handling a couple of moderately sized forums quite well.

  • ub3rstar said: whenever a drive in the RAID array fails and is replaced, the filesystem becomes readonly until the OS is rebooted

    That defies the definition of RAID and is completely untrue.

    The NAS used is a NetApp storage system. These units are built for enterprise level reliability and have the price tag to prove it.

    The issue that bites these Atom servers is the iSCSI implementation. This last outage, the first in nearly a year, was caused by a switch card failure. These things happen, they get replaced and things trundle on.

    Of course, along the way a few Atoms would go offline from time to time without an inherent reason. Often it was just the iSCSI daemon on the server failing: the OOM killer killing iscsid, iptables being reloaded without ensuring that the iSCSI ports were left open, processor saturation causing iSCSI to fall over, and so on.

    The culprit in 99% of cases is iscsid falling over.
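A rough way to check for two of the causes listed above (iptables reloads and the OOM killer) could look like the sketch below. It is only an illustration under assumptions: Python 3 is installed, the storage target listens on the standard iSCSI port 3260, the firewall rules are available as `iptables-save` style text, and the kernel exposes `/proc/<pid>/oom_score_adj`. The function names are made up for this example and nothing here is Delimiter's own tooling.

```python
import re

# Assumption: the target uses the IANA-registered iSCSI port.
ISCSI_PORT = 3260


def has_explicit_iscsi_accept(rules_text, port=ISCSI_PORT):
    """Return True if iptables-save style text has an ACCEPT rule naming the port.

    Deliberately simple: it ignores rule order, chain policies and
    multiport ranges, so treat a False result as "go and look", not as proof.
    """
    pattern = re.compile(rf"--[sd]port\s+{port}\b")
    for line in rules_text.splitlines():
        line = line.strip()
        if line.startswith("-A") and "-j ACCEPT" in line and pattern.search(line):
            return True
    return False


def oom_score_adj(pid):
    """Read the OOM score adjustment for a process (-1000 exempts it from the OOM killer)."""
    with open(f"/proc/{pid}/oom_score_adj") as fh:
        return int(fh.read().strip())
```

Feeding it the text of a rules file before reloading iptables gives an early warning if the iSCSI rule has gone missing. Reading `oom_score_adj` for the iscsid PID shows whether the daemon is currently protected from the OOM killer.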

  • and have the price tag to prove it.

    Eh, not really, in my exp Dell and HP worked far better for much less money.

  • Bruce Member

    @MarkTurner said:
    The culprit in 99% of cases is iscsid falling over.

    any suggestions on the best way to deal with this? my atoms stop responding, so power cycle is the only solution at the moment. if they are alive, but no disk, is it a matter of running a cron to check status and restart the service if not?

  • @Bruce - if you restart iscsid then you'll turn your filesystem read-only.

    From my experience with two of them over the past 18 months, CentOS 6 seems a lot more robust than Ubuntu for iSCSI root. This is not a scientific observation, just a comparison of two Atoms on the same pod, same rack, same switch, etc., but one with CentOS 6 and the other with Ubuntu 12.04.
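On the cron idea: since restarting iscsid reportedly turns the root filesystem read-only, a periodic check is probably safer if it only reports and never restarts anything. A minimal sketch, assuming Python 3 on the Atom and a Linux `/proc`; the function names and message strings are invented for illustration:

```python
import os


def root_is_read_only(mounts_text):
    """Return True if the last mount entry for "/" carries the "ro" option.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    read_only = False
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == "/":
            read_only = "ro" in fields[3].split(",")
    return read_only


def process_running(name, proc_dir="/proc"):
    """Return True if some process under proc_dir was started as `name`."""
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_dir, entry, "cmdline"), "rb") as fh:
                argv0 = fh.read().split(b"\0", 1)[0].decode(errors="replace")
        except OSError:
            continue  # process exited while we were scanning
        if os.path.basename(argv0) == name:
            return True
    return False


def health_report(mounts_text, iscsid_up):
    """Summarise the state as one line; it reports only and restarts nothing."""
    if root_is_read_only(mounts_text):
        return "CRITICAL: root filesystem is read-only"
    if not iscsid_up:
        return "WARNING: iscsid is not running"
    return "OK"
```

From cron, one would read `/proc/mounts`, call `process_running("iscsid")`, and send the `health_report` line somewhere off the box, since a read-only root cannot write a local log.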
