Shitty SuperMicro

randvegeta Member, Host Rep
edited February 2018 in General

I'm really starting to get sick of the crap that SuperMicro come out with.

On paper, their stuff looks good, but the number of problems I have seen from them is just unreal. The failure rate, and reliability issues are just atrocious.

Don't get me wrong, some SM gear has been solid for years, but others just suck.

So far, statistically, the best experience I've had with SM is with second hand gear. Presumably because the broken ones have already been repaired or replaced. But I swear, with new gear, there are problems at least 10% of the time, and that's just way too high.

Problems Include:

  • IPMI Failing - Requires hard power reset to get it back.
  • Backplanes on Microclouds failing, resulting in hot-swap disks not working.
  • Boards shipped with dodgy pins causing short-circuits.
  • Boards just not working for no apparent reasons.
  • NIC issues that make network unstable, requiring PCIe addon card.

Problems for 1/10 of orders may not sound crazy bad, but compared to consumer gear, we had probably less than 1/100 units with problems. Consumer gear is considerably cheaper too.

Now in all fairness, the 2nd hand SM gear we've acquired has been much better. As far as I'm aware, we've had no problems at all with the second hand stuff we get. It's just the new stuff that is problematic.

The biggest issue with the new gear is that the warranty is completely worthless. If there are bent pins, SM (or rather the authorised dealer) never accepts responsibility for it. So we gotta pay! The backplanes on the Microclouds are running live/active nodes. The failure seems to be specific to certain bays, and not all nodes, but if you want to fix it, you need to shut down all nodes in the chassis. Which means you need to accept downtime for every node in the chassis just to fix 1. Even if they fix that for free, it's expensive to deal with! And RMA turnaround can take months! MONTHS!

Failing IPMI makes it completely worthless. Why bother with server boards if you don't even get IPMI?

Anyone else see these problems or just me?

Any decent alternatives to SM? Decent meaning, not selling full bare-metal servers with proprietary boards. HP and Dell seem only to make proprietary stuff. Intel's boards require a separate module for 'IPMI'. Who else is there?

Thanked by 2DewlanceVPS coreflux

Comments

  • We've seen the same issues. To top it off their BIOS takes an atrociously long time to boot. I've had an issue where a DIMM slot was bad on a board but didn't find that out until a couple years later down the road when we upgraded the RAM. I'm not sure what kind of QA checking goes into SuperMicro boards because too frequently there are problems.

    A few other manufacturers are starting to make boards meant for the datacenter. We've tried asrockrack in a single machine and were happy but haven't purchased anything on scale yet.
    http://www.asrockrack.com/
    https://www.asus.com/us/Commercial-Servers-Workstations/Commercial-Server-Motherboards-Products/

  • Clouvider Member, Patron Provider
    edited February 2018

    Surprisingly you’re the first person I know to have such issues and contrary to our experience with 100s of Servers.

  • You just gotta deal with it or build your own HW like online.net is doing with scaleway.
    Maybe a change of distributor is worth trying?

  • randvegeta Member, Host Rep

    Corey said: We've seen the same issues. To top it off their BIOS takes an atrociously long time to boot. I've had an issue where a DIMM slot was bad on a board but didn't find that out until a couple years later down the road when we upgraded the RAM. I'm not sure what kind of QA checking goes into SuperMicro boards because too frequently there are problems.

    Yes that's another weird issue! Some DIMM slots have problems, and you don't find out until you've actually started using that slot.

    I actually have quite a few boards that are otherwise in perfect working order, but have a couple of dodgy RAM slots, which makes the max available RAM quite a bit smaller than it should be.

    ANOTHER problem is that the IPMI KVM hardly works. Anything that uses a Java app seems to not work these days. It's just a giant pain in the arse.

    The HTML5 replacement is much better IF it works. But very often it just doesn't.

    Clouvider said: Surprisingly you’re the first person I know to have such issues and contrary to our experience with 100s of Servers.

    We must have some bad luck with some bad batches or something. But this has been pretty consistent for all new SM gear ordered, no matter where or who we order from. UK or HK supplier, same BS problems!

    Ordering from HK dealers also seems to have zero benefit over a US dealer. HK companies do not hold stock, so the ETA on delivery is 3-4 weeks. And because their volume is so low compared to US or EU vendors, it's normally cheaper to buy from USA or UK. The only reason to order from HK is the warranty, but as mentioned, it's completely worthless! What's the point of a warranty if you still have to pay for repairs, or need to wait 3 months to get something repaired?

    Clouvider said: Surprisingly you’re the first person I know to have such issues and contrary to our experience with 100s of Servers.

    Now you know 2!

  • randvegeta Member, Host Rep

    VirtualByte said: build your own HW

    Who builds their own HW? If you mean assemble their own, then we already do. But it's not like we have a massive factory churning out motherboards.

  • Damian Member
    edited February 2018

    Are you bouncing these concerns off your rep? If not, consider buying from a direct importer, because you'll find that things like...

    Corey said: To top it off their BIOS takes an atrociously long time to boot. I've had an issue where a DIMM slot was bad on a board but didn't find that out until a couple years later down the road when we upgraded the RAM.

    ...will still be taken care of by your rep, either with a cross-shipped board replacement or chassis replacement.

    The Supermicro experience improves significantly when you engage them directly.

  • Clouvider Member, Patron Provider

    Then perhaps you need to look at a server builder who will do the QA for you and deliver a ready, plug and play product.

  • randvegeta Member, Host Rep

    Clouvider said: Then perhaps you need to look at a server builder who will do the QA for you and deliver a ready, plug and play product.

    Such angels exist?

  • deank Member, Troll

    Not sure whether they are angels. I mean you have to pay some extra.

  • @randvegeta said:

    Clouvider said: Then perhaps you need to look at a server builder who will do the QA for you and deliver a ready, plug and play product.

    Such angels exist?

    Such angels probably charge you an arm and a leg. Possibly one of your kidneys too.

  • Clouvider Member, Patron Provider

    teamacc said: Such angels probably charge you an arm and a leg. Possibly one of your kidneys too.

    Not really. Well, depends on the volume I suppose.

  • hostdare Member, Patron Provider

    X10 or newer boards have an HTML5 KVM console too.

  • FoxelVox Member
    edited February 2018

    I also have an X9-DRW-iF with a Supermicro chassis which has IPMI issues. Every time I want to reboot it I have to power cycle it on the power bar, otherwise the BIOS won't POST.

    edit: i have one of these servers for sale with Dual E5 2620 and 64g ram, let me know if you're interested in a second-hand bargain server lol.

  • randvegeta Member, Host Rep

    Clouvider said: Not really. Well, depends on the volume I suppose.

    From what I can tell.. HK is full of DIY servers, mostly using consumer grade HW. Otherwise I would have imagined the volume of SM would be much higher. But then again Dell are pretty big hardware providers here. But I find their prices unreasonably high.

    They do have an attractive turnaround time though. Same day replace/repair, offered 24/7/365, which is good. But their prices are quite a lot more. Even with a 10% failure rate, it's actually cheaper to just buy 20% more SuperMicro crap, and faster too since we can replace it ourselves.

    I just don't understand why they have such a high failure rate compared to cheap desktop boards. Seriously, when was the last time your desktop boards crapped out on you? I've got a server board failing with almost every order.
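
    To put rough numbers on the "buy 20% more" argument above, here is a minimal sketch. The prices are made up purely for illustration (they are not real SuperMicro or Dell quotes), and the function names are hypothetical. It only compares up-front purchase cost and ignores labour, downtime and shipping.

```python
def cost_with_spares(unit_price, units_needed, spare_percent):
    """Total spend when buying extra units as cold spares.

    spare_percent is an integer percentage (20 means 20% extra),
    rounded up to whole units.
    """
    spares = -(-units_needed * spare_percent // 100)  # integer ceiling division
    return unit_price * (units_needed + spares)


def cheaper_with_spares(unit_price, premium_price, units_needed, spare_percent):
    """True if cheap boards plus spares cost less than premium boards without spares."""
    cheap_total = cost_with_spares(unit_price, units_needed, spare_percent)
    premium_total = premium_price * units_needed
    return cheap_total < premium_total


if __name__ == "__main__":
    # Hypothetical prices: 1000 per cheap board, 1500 per premium board.
    print(cost_with_spares(1000, 100, 20))
    print(cheaper_with_spares(1000, 1500, 100, 20))
```

    Under these assumptions the spares approach wins whenever the premium vendor charges more than 20% extra per unit, which is the break-even point implied in the post.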

  • randvegeta Member, Host Rep

    FoxelVox said: edit: i have one of these servers for sale with Dual E5 2620 and 64g ram, let me know if you're interested in a second-hand bargain server lol.

    You've got a problem board for sale? :D

    I buy almost all my 2nd hand SM gear from the UK. The seller tests each board prior to shipping out. He also accepts (and pays for) returns in the event of any problems. He's not the cheapest second hand dealer, but I trust his QC better than SM's.

  • Clouvider Member, Patron Provider
    edited February 2018

    Great, it's good to know, I suppose.

    I've got a server board failing with almost every order.

    Well, you may wish to check other potential causes of failure in that case.

  • PUSHR_Victor Member, Host Rep

    Seen only NIC issues so far. Can't come up with a failure rate but it's higher than any other vendor I've dealt with.

  • leapswitch Patron Provider, Veteran

    We have had a similar experience with Microclouds after which we stopped buying them. 60% of our Microclouds have had Backplanes replaced. 2-3 of them have had backplanes replaced 3-4 times and then the entire chassis replaced. It takes 1-2 months for a replacement to arrive in India. ZERO problems with 1U or 2U single server machines even though we have 100s of them deployed.

    Thanked by 2randvegeta vimalware
  • I've had issues with just about everything, though by far the least issues with Supermicro - get a new rep, today.

    Thanked by 2Clouvider randvegeta
  • randvegeta Member, Host Rep

    leapswitch said: We have had a similar experience with Microclouds after which we stopped buying them. 60% of our Microclouds have had Backplanes replaced. 2-3 of them have had backplanes replaced 3-4 times and then the entire chassis replaced. It takes 1-2 months for a replacement to arrive in India.

    Yes I've stopped ordering the Microclouds entirely. I mean they are great for racking. Saves a ton of space, are power efficient, and are way way easier for cabling. But backplane issues really are unacceptable. I'm now trying to source just the chassis (with backplanes) just to make sure we have sufficient redundancy in case more issues arise.

    But new builds are exclusively in 2U chassis now. Far easier to deal with.

    Aidan said: get a new rep, today

    We already source from 3 different vendors (mainly we buy from 1 in EU and 1 in HK). But both seem to be problematic. The EU vendor is easier to deal with than the HK vendor, and actually sends out replacement parts for free, even before we send back the faulty unit. But shipping time and cost is a bitch.

    Thanked by 1Aidan
  • randvegeta Member, Host Rep
    edited February 2018

    Clouvider said: Well, you may wish to check other potential causes of failure in that case.

    Only on the new stuff. Not the old stuff, and we never had problems with consumer gear when we were using it back in 2010.

    Clouvider said: Surprisingly you’re the first person I know to have such issues and contrary to our experience with 100s of Servers.

    What kind of SM servers do you have? Microclouds by any chance?

    If you have not experienced any problems at all, maybe there are certain products that are less prone to problems? Or maybe your vendor has outstanding QC?

  • Clouvider Member, Patron Provider

    @Aidan said:
    I've had issues with just about everything, though by far the least issues with Supermicro - get a new rep, today.

    Still, even then it does not explain the 10% failure rate described in the OP.

    We literally take hundreds of units, always new ones, from motherboards through SuperServers, FatTwins and MicroClouds in both variants. My failure rate is about 10 times lower and, to date, I have only had a single DoA, which was the 'IPMI not working' issue in a MicroCloud 12 node unit. That was fixed by a motherboard cross shipment from their warehouse that arrived in about 3 working days.

    It feels like if there's such a high fail ratio, there are perhaps some issues during the assembly/build stage rather than at the factory. Unless somehow we're that lucky.

    Thanked by 1techhelper1
  • Shazan Member, Host Rep

    Same here, the failure rate of our SM servers (1U) is extremely low. We experienced just a couple of PSU failures after ten years or more.

    Thanked by 1techhelper1
  • randvegeta Member, Host Rep

    Clouvider said: It feels like if there's such a high fail ratio, there are perhaps some issues during assembly/ build stage rather than the factory. Unless somehow we're that lucky.

    If it were our handling / build process, then we would probably see more issues with our 2nd hand stuff too. But our 2nd hand stuff is far more reliable.

    Also, it's rare that the boards are DOA. It's always problems that develop over time.

    Do you have the 8 node MicroClouds or are you using the 12 (or indeed 24 node) ones?

    We are exclusively using the 8 node chassis.

    In any case, the most problematic of SM products seems to be the MicroClouds. 50% of the MicroClouds we have deployed have shown some sort of problems. 75% if looking only at the new ones, and 0% of second hand ones (so we have 0 problems with second hand MicroClouds so far).

    The boards are less problematic, and some boards are more problematic than others. The full size eATX boards seem to be the most reliable and the mATX boards seem to be the least. Not sure why size matters, but the smaller boards generally have more problems. Mostly it's IPMI related. Luckily all our racks are equipped with APC reboot switches, so we can always reboot via that if the IPMI is down. It normally fixes the IPMI issue too.

    If we are just talking about complete failures, then it's not so large. Probably <1%. But problems of some sort, major or minor, affect at least 10%.

  • randvegeta said: 50% of the MicroClouds we have deployed have shown some sort of problems. 75% if looking only at the new ones,

    Please tell me we're talking about a sample size of four.

  • edited February 2018

    What model motherboard? I've had nothing but good experiences with SuperMicro. Always rock solid. Hate their Java IPMI console but it's getting better with HTML5 support on newer models. Only thing is no virtual ISO support on HTML5 yet. Not sure what you are trying to compare them to but they are definitely a significant step up in quality compared to retail consumer stuff.

    I tried using an Intel server motherboard once and the thing kept locking up every few weeks. Now that was crap hardware.

  • randvegeta Member, Host Rep

    Aidan said: Please tell me we're talking about a sample size of four

    LOL. No. It's not that many microcloud Chassis, but does come to about 80 servers worth. The MicroClouds are 8 nodes per chassis, so 10 Chassis. It's a small number to be fair, but it affects quite a few servers.

    Thanked by 1Aidan
  • randvegeta Member, Host Rep

    LosPollosHermanos said: I tried using an Intel server motherboard once and the thing kept locking up every few weeks. Now that was crap hardware.

    I've found Intel's to be rock solid. But I have far fewer Intel boards, and all of them are much higher end and more expensive. The most problematic SM boards I have are for E3 CPUs. The E5s are alright, and I only run E5s on the Intel boards.

    The biggest issue with Intel is that they are more expensive, and the KVM is an optional extra. You need to pay for an actual physical chip and install it onto the board to use it. Otherwise you need an old-fashioned KVM device that connects directly to the VGA and PS/2 ports.

  • Clouvider Member, Patron Provider

    Sample size of 10 does not justify calling them ‘shitty’ in public.

    Thanked by 1MCHPhil
  • deank Member, Troll
    edited February 2018

    100 won't do, either. 10,000 maybe, bare minimum of what I'd call a proper sample size.

    Thanked by 2Aidan MikePT
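
    The sample size disagreement above can be made concrete with a confidence interval. Below is a rough sketch using the Wilson score interval for a proportion. It assumes the thread's figures mean roughly 5 problem chassis out of 10 (50% of about 10 MicroClouds), treats each chassis as an independent trial, and uses z = 1.96 for an approximate 95% interval. The function name is hypothetical and none of this comes from the posters themselves.

```python
import math


def wilson_interval(failures, n, z=1.96):
    """Approximate confidence interval for a failure proportion (Wilson score)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = failures / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Same observed 50% problem rate at the sample sizes mentioned in the thread.
    for n in (10, 100, 10000):
        low, high = wilson_interval(n // 2, n)
        print(f"n={n}: {low:.3f} to {high:.3f}")
```

    With only 10 chassis the interval is very wide, so the data are consistent with anything from a modest to a severe problem rate. It narrows quickly as the sample grows, which is the point both sides of the argument are circling.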