Comments
Another worthless and unprofessional reply, eh? If you have nothing to add, why respond? The question asked was about alternative manufacturers that make standard form factor boards.
Once, but it lasts 30 days at a time. Technically that means in March I'll get it twice since there just aren't enough days in February!
Interesting. I'll check them out.
You seem to focus more on the rant less on the alternatives.
In any case good luck & good night.
@leapswitch,
Just out of curiosity, what exactly are the problems you're seeing with the Microcloud backplane?
I'm seeing 2 problems.
1.) Sliding the nodes in/out from the back SOMETIMES causes all nodes to restart.
2.) Some of the hotswap drive bays stop working. For example, we have 4 chassis with 8 nodes each, and in 1 of the chassis there are 2 bays (belonging to 1 node) that don't work at all. Basically it makes the node almost unusable since you cannot attach an SSD or HDD.
Do you see similar problems? Or, if yours are different from the above, what kind of problems do you see?
We are only seeing problem 2: 1 node, both bays stop working. Moving that node into a different slot works. After replacing the backplane, the 2 bays start working, but after a few months the same bays fail again. On a couple of Microclouds, different bays failed. On 2 Microclouds we replaced the entire chassis, and even after that 2 bays have failed.
Regardless of sample size, it seriously affects business when you need an hour of downtime for 7 other customers. We now keep 1 spare chassis so we can simply move nodes from an affected Microcloud into it while backplanes are being replaced.
In 1U / 2U servers, we have had 1 fan failure and 1 motherboard failure across 400+ servers in USA and India.
Since last year, we have also started purchasing large lots of ASUS servers and have had a good experience with the distributor.
Are the bays the same on all the chassis? So far, all my failed bays are on the 7th node (bays 13 and 14).
Indeed. Not to mention the cost. Our supplier requires us to return the faulty part for an RMA check before they give us a new one. So we need to BUY a new one before we can get the old one fixed, which is ridiculous! That's why I've said the warranty is practically worthless.
No, this varies.
In India, we do not need to buy a new one. We provide IPMI or physical access to the distributor's support and they contact Supermicro for an advance replacement. Supermicro sends over a part (1-2 weeks), then we schedule a downtime with customers and get it replaced (1-2 weeks).
The majority of SoftLayer's servers are Supermicro. If SM were as crappy as you say, they wouldn't use it.
I wonder if they use Microclouds.
@leapswitch, do you have problems with all models of the Microcloud or do you see more problems with certain models over others? For example.. do you have any E5 based MCs?
We have only 8-node E3 MCs. No E5s or 12/24-node E3s.
Yes, they use FatTwins and Microclouds.
Then perhaps those are more reliable? I am perfectly willing to accept that they may have different levels of reliability for different product lines.
One thing I wish providers who rent Supermicro servers would do is automatically reset the BMCs of their deployed fleet on a regular basis. Especially those who don't provide full IPMI privileges (a customer-provided operator user can't do this). That's normally all it takes to keep the KVM from going awry.
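A periodic reset like that can be done with stock `ipmitool` from any management host. A minimal sketch, where the BMC address, the `ADMIN` user, and the password file path are all placeholders for your own fleet:

```shell
#!/bin/sh
# Cold-reset a Supermicro BMC over the network so the KVM/console recovers.
# This reboots only the management controller; the host OS keeps running.
# 10.0.0.50, ADMIN, and /root/.ipmipass are placeholders for your environment.
ipmitool -I lanplus -H 10.0.0.50 -U ADMIN -f /root/.ipmipass mc reset cold
```

Looped over the fleet from cron (e.g. `0 4 * * 0` for weekly), this is a config fragment rather than something a customer can run: `mc reset cold` requires Administrator privilege on the BMC, which is exactly why an operator-level user handed to customers can't do it themselves.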
We had a batch of SM servers that crashed all the time under the Xen hypervisor. A later batch is fine, though. We suspect it was a bad batch of motherboards, but tbh we have no clear clue.
Did you try to update the BIOS?
That was some years ago and these boxes were shipped to one of our customers for virtualization. At that time the problem couldn't be resolved, and I guess we just replaced those servers? IDK.
For the security work that we do, there are not many options for Dual Xeon high-end workstations that can support 4 or more GPUs.
I wish I had seen this post before we bought a pair of Supermicro GPU workstations 2 months ago. One of the workstations had a failed DIMM slot, so it had to be RMA'd. Last week, when we added memory to both workstations, the second one started reporting ECC memory errors even though there's nothing wrong with the RAM we are using. Basically we are batting a 100% failure rate on our experiment with Supermicro equipment.
And these workstations aren't really designed to be workstations either. They are actually 4U servers turned on their side. It's a terrible design for a workstation because the high-static-pressure fans make them way too loud to use at full load.
After I mentioned our failure with Supermicro to one of our customers, they confirmed similar problems and they were planning to replace about 60 of their Supermicro servers in one of their data centers.
It seems like Supermicro quality control has really gone down in recent years.
Unfortunately, your 100% failure rate is meaningless because 2 does not make a reasonable sample. Apparently even my sample of 10 Microclouds (80 nodes) is not sufficient, so our experiences can be discounted entirely.
Clearly SM has near-100% reliability, since anyone who says otherwise doesn't have a sample size large enough to be meaningful and so can be ignored entirely. On the other hand, if you have a sample of 1 and it's good, then that's fine. It's only not fine if it's bad.
Yup - I completely agree with you. Perhaps we are just unlucky. But since we are a small shop, we are just going to go back to using Tyan mobos and our custom builds. We've been pretty lucky with other brands in the past.
ASUS.
While I personally do not like SM gear, it is very simple here: you get EXACTLY what you pay for. SM gear is always like 25% cheaper than comparable HP Gen8 (E5 v1) gear, but, as you saw, also at least 25% more crap.
The IPMI is the most useless of all; the only thing one can credit them for is the usual use of Intel over Broadcom NICs.
Have you ever turned on an HP DL360/380? Or a DL580... a DL980 can take 10 minutes to boot...
Why the fuck would you order SM gear in HK? It's a damn Taiwan-based company; order in Taiwan, or in the EU/US from their NL/US branches. Your country has no import tax, and purchasing locally is dumb; nobody does it, which is EXACTLY why there is little to no stock. These servers literally come from Taiwan to HK within those 3 weeks. It makes zero difference where you order.
I can confirm similar issues to what you've seen, yeah, and on top of that SM has notably more issues with power backplanes in dual-PSU servers.
An ML350 from HP is also just a server turned sideways...
That was a good one
I find that the only way to buy new Supermicros is to have them built and burn-in tested. I've found that component builds have a high failure rate. Even when you get the distie / SM to build the servers for you, they still come in cheaper than Dell / HP.
Sad story, considering that Supermicro is a brand very many like and buy. But there seems to be a pattern. I remember other Taiwanese companies that, once they got big, "optimized profit" and got considerably worse in quality. Sad.
Gosh, this thread again...
That's true, and it's worth mentioning that Dell doesn't sell barebones. You can't compare barebones vs. built and tested servers; there's a reason the built ones are more expensive...
If you want to save money by becoming an integrator, you have to assume the risks of an integrator. You can't have your cake and eat it too...
When SM has problems, threads like these will be revived.
Perhaps I'm not the only one with problems after all.
SM doesn't have problems, that's the whole point.
How do you figure?
Well when you talk to official dealers in HK, they talk a good game at first. They don't tell you that they don't keep stock or actually provide any value. They tell you they will handle your warranty issues and provide replacements and what not.
In reality, they do their best to avoid dealing with problems, and when they do agree to take back equipment, they have ALWAYS charged for work that should have been covered under warranty. And I do mean ALWAYS, basically claiming the problem was caused by our mishandling. HK vendors for SM are expensive and not at all helpful. They don't even know which hardware is compatible.
Their volumes are so low that it's cheaper to buy from a US or EU vendor and air-mail it. By sea it could be cheaper still. Our EU vendor sometimes ships directly from TW for us too. RMA is a bitch, but they are more willing to accept it than HK vendors, and that makes a world of difference IMO.
Because, as concluded earlier and in many other threads:
Start paying a reliable integrator to deliver your servers and you'll suddenly get much higher quality (and a bigger bill).
We had a bad batch of UPS equipment that was shorting out HP switches. UPS can sometimes become a culprit rather than the solution.
The end is nigh.