Comments
Power has been restored, finally!
So has it been restored for Psychz Networks in Dallas? As far as I can tell, there is still no sign of power to my infrastructure, and all of my uptime monitoring services (UptimeRobot, HetrixTools) still report it down.
I have sometimes wondered whether I should have bought an ASN to announce my ARIN /24 IPv4 block myself, but for now I am just curious when this will be fixed.
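If you want a second opinion alongside hosted monitors like UptimeRobot/HetrixTools, a tiny self-run TCP probe is enough. A minimal sketch (the hostnames below are hypothetical placeholders; run it via cron from a machine outside the affected datacenter):

```python
#!/usr/bin/env python3
"""Minimal sketch of an independent reachability probe, as a second
opinion alongside hosted monitors. HOSTS are hypothetical placeholders."""
import socket

# (hostname, TCP port) pairs to probe -- hypothetical examples.
HOSTS = [
    ("server1.example.com", 22),
    ("server2.example.com", 443),
]

def is_up(host: str, port: int, timeout: float = 5.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for host, port in HOSTS:
        print(f"{host}:{port}", "UP" if is_up(host, port) else "DOWN")
```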
We had to manually power up some of our boxes and deal with fscks and things not coming back as they should have. The good news is the majority are back online and working fine.
My Psychz quarter rack has been online for a couple of hours now, according to my monitoring, although I am seeing some network stability issues every now and then.
My guess is some of their non-colo servers are still being booted/fsck'ed.
Thank you. I just saw from my monitoring tools (HetrixTools/UptimeRobot) that they came back online.
I have a half rack with Psychz; I used to have a quarter rack from them and then upgraded.
I am worried about doing a firmware upgrade on my firewall right now due to the power issue, so I will do that later.
I mean, if it was an actual storm, then it's hard to stay online when there is no infrastructure...
That's why these data centers have backup generators. Not sure about Carrier-1 Dallas, but the other data centers I deal with have at least 3 days of fuel to run without utility power, with the ability to accept more fuel if needed.
Carrier-1 claims for their Dallas location:
At the end of the day, it does look like something borked when they attempted to transfer off of battery backups to the generators.
From tier.net:
Dear Customer, At approximately 6:17am CST our NOC began to see alarms in our Dallas, TX POP. As we are still investigating the issue, we do not have an ETA for resolution yet, but hopefully it will be very soon. Meanwhile, your patience is appreciated. We will post updates shortly.
UPDATE: 6:35am CST: Upon further investigation, we suspect the entire facility at Carrier-1/Prime Datacenters in Dallas has gone dark. It is likely power related. We are awaiting updates from onsite.
UPDATE: 7:25AM CST: We have received confirmation from the facility that this is a power outage. They have confirmed that there is no physical damage to the facility from the storm in the area. The generators are running, however the UPS are not receiving any power from the generators. Facilities and generator contractors are enroute with an ETA of 30-45 minutes.
UPDATE: 9:02AM CST: Power has been restored to our POD and office in Dallas. Networking is back up and most servers are back online. If you still have servers or services down, please post a support ticket and we will investigate with urgency.
UPDATE: 11AM CST: We are still working on scattered issues related to the earlier outage. Servers in cabinets Bi04 and DK12 are up but without proper networking. We are working urgently on this as well as remaining hosts that have been reported to us.
A full post mortem will be posted here and as an announcement once we have gathered all of the facts. The underlying issue was that despite redundant power systems that were online and functional, the facility may have had some type of malfunction with the automatic transfer switch (ATS). Powerful storms and tornadoes in Texas overnight caused deaths and destruction in the area, but the facility should have remained unaffected even with extended loss of utility power, as it has before. These systems are indeed routinely tested. Please have no doubt that our team will ensure that all necessary investigations, and ultimately fixes, will be employed to avoid even the slightest chance of a repeat in the future. We appreciate your patience and understanding.
Only have 4 servers there. Two are back up. Two have damaged/failed RAID cards and will not boot.
My ImpactVPS Dallas VPS is up now... thankfully.
Just learned about the devastating storm; I hope everyone in the affected areas is safe.
All of our servers came back online a few hours ago. We have servers with Hivelocity and TailorMadeServers.
This reminded me of when AT&T went down and the entire parking lot at Infomart was filled with their trucks.
This is one of the reasons we prefer software RAID over hardware RAID.
Yeah, they're older servers. Came to me with hardware RAID. I use software for everything these days. Don't even waste money on a card.
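For anyone moving to software RAID: the Linux md driver exposes array health in /proc/mdstat, so a degraded mirror is easy to catch before the second disk goes too. A minimal sketch (assumes mdadm-managed arrays on Linux; wire the exit status into whatever alerting you already use):

```python
#!/usr/bin/env python3
"""Minimal sketch: flag degraded Linux software-RAID (md) arrays by
parsing /proc/mdstat. A healthy two-disk mirror reports "[UU]"; an
underscore, e.g. "[U_]", marks a failed or missing member."""
import re
import sys

def degraded_arrays(mdstat_text: str) -> list[str]:
    """Return names of md arrays whose status line shows missing members."""
    bad = []
    current = None
    for line in mdstat_text.splitlines():
        m = re.match(r"^(md\d+)\s*:", line)
        if m:
            current = m.group(1)
        # Status lines look like: "... blocks ... [2/2] [UU]" when healthy.
        elif current and (st := re.search(r"\[([U_]+)\]\s*$", line)):
            if "_" in st.group(1):
                bad.append(current)
            current = None
    return bad

if __name__ == "__main__":
    with open("/proc/mdstat") as f:
        failed = degraded_arrays(f.read())
    if failed:
        print(f"DEGRADED: {', '.join(failed)}", file=sys.stderr)
        sys.exit(1)
    print("all md arrays healthy")
```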
Serverdime/Hostrush is still down. Dunno who else either got their stuff completely ruined or just isn't paying attention.
The Serverdime, Hostrush, and Servercheap websites are all still down, and no updates are coming from them. Anyone have any updates from them?
I don’t think I’ve ever had a power or internet outage. Maybe I should provide hosting.
Those were all mine, sorry for taking up all of the space.
Once had servers with Tier in Dallas (they use(d) Psychz or Carrier-1 directly, I think), and at that time those big snowstorms/blizzards were hitting Dallas. I was quite impressed then by how the DC stayed online throughout those storm days, as the Dallas region had major power outages, if I remember correctly.
What is rather concerning this time is that practically every provider I can think of that has a location in Dallas is using the (now) Evocative DC. I mean, does Dallas not have any other DC providers?
What else could be used in Texas besides those providers in the Spring, TX DC?
Edit: Looking further, it does seem as if e.g. USDedicated is using a different DC in Dallas, but I'm not super impressed with GSL as upstream... at least they seemed to be online yesterday.
Any updates? 🤔
The Lone Star State has been hit by severe weather in the past few days, and it seems it is still ongoing.
https://www.weather.gov/sjt/TexasSevereWeatherOutlooks
I emailed him, but his email server was down until this morning. I tried calling him and he called me back; very courteous and apologetic. He told me to give him about an hour (for my use case), but it seems like maybe he is on site, so that's good.
For colo you can look at TierPoint or QTS.
I can't imagine. It looks like it's a one-man show. Not a criticism, I admire folks who run their business end-to-end. Just... Oh, the pressure.
Sure hope Prime has their shit together now, considering the winds last night and the power outage affecting 600k in Dallas right now.
There are other DCs here in Houston and Dallas, but if you want in those you will be paying a premium; on average they run around $99/mo for 1RU.
We've been in this datacenter since January of 2015. There was a fiber cut that caused an outage; ticket logs show it was Feb 11, 2015. Since then, I do not recall any major outages (until this one) that affected the entire facility.
Even through super bad weather, and indirect effects of hurricanes like Harvey and Laura, the facility stayed up.
Personally, I would not recommend jumping ship over this; at the end of the day, this facility-wide outage was fairly short.
And if weather outages concern you, avoid Houston. I created a WHT thread back in the day when Harvey hit Houston hard and flooded the bayou and the basement of SoftLayer.
Datacenters, no matter how fault tolerant or perfect they claim to be, will eventually have an outage. We saw this earlier this year in January with Equinix CHI1 5th floor due to chiller failures.
The only thing you can do is take backups, verify your backups actually work, and have plans to mitigate the downtime where/when possible.
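On the "verify your backups actually work" point: the only verification that counts is a test restore. A minimal sketch of the comparison step, assuming you have already restored a backup to a scratch directory (the paths are hypothetical placeholders; the restore itself, via rsync, tar, restic, etc., is assumed to have run beforehand):

```python
#!/usr/bin/env python3
"""Minimal sketch: compare a test-restored backup against live data by
SHA-256 manifest. Paths in __main__ are hypothetical placeholders."""
import hashlib
from pathlib import Path

def sha256_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest.
    read_bytes() is fine for a sketch; chunk reads for very large files."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[str(path.relative_to(root))] = digest
    return manifest

def verify(live: Path, restored: Path) -> bool:
    """Report files missing from, or differing in, the restored copy."""
    a, b = sha256_manifest(live), sha256_manifest(restored)
    missing = a.keys() - b.keys()
    changed = {p for p in a.keys() & b.keys() if a[p] != b[p]}
    for p in sorted(missing):
        print(f"MISSING in restore: {p}")
    for p in sorted(changed):
        print(f"CHECKSUM MISMATCH: {p}")
    return not missing and not changed

if __name__ == "__main__":
    # Hypothetical paths -- point these at real data and a real test restore.
    ok = verify(Path("/srv/data"), Path("/tmp/restore-test"))
    print("backup verified OK" if ok else "backup verification FAILED")
```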
I think you're forgetting about when the CRACs failed and DC2 was like a sauna on June 11th, 2022, or when the DC lost power on January 29th, 2021 with the same issue. We moved out of that DC in March, I believe, and it was hands down the best decision we made.
Your hardware is not safe, anyone with a $15 key from Amazon can open your rack and mess with your hardware. If you have business critical hardware in there, I'd stay away.
On June 11th, at peak, our recorded inlet ambient temps were only marginally above most standard operating temps (around 70°F), and only for about 20 minutes before they started falling again.
And HVAC issues are going to happen in any datacenter. As stated in my previous post, look at Equinix CHI1 5th floor in January of this year, multiple cascading chiller failures, resulting in a very hot floor. Equinix was preventing people from entering for more than ~5 minutes at a time.
Would you tell people not to utilize Equinix CHI1? It's one of the most connected facilities on the planet.
I don't have a notification or ticket for this date, and I don't see any outage on our monitoring for that date either. Was this through a reseller or direct?
I mean, you can say that about any datacenter with a shared access floor. If you need more security than what the shared floor offers, get an isolated cage/suite. Anyone I've ever sent to the datacenter was checked at the door for approval via ticket, it's not like they're walking in off the street and loading up machines.
The CH1 chillers failed due to below-freezing temperatures, which ultimately froze them over. These are two entirely different incidents, as the Evocative case was due to improper maintenance.
We experienced this outage in DC1, but another company, Nexril, experienced it in DC2 as well.
No. Since you like to mention Equinix: they use a pin-pad locking system for shared-space cabinets; there are no keys involved with those racks.
You can be a customer of one company and still have the ability to take down a competitor's machines.
Pretty sure there was also a case of someone calling in and asking the DC to turn off the breakers on some cabinets (with no verifying information requested), and it being done?