Redundant switches

mpkossen Member
edited January 2013 in General

We're having a nice challenge here at the office. We currently have 2 cabinets with one switch each and 2 uplinks in total. The switches are linked to each other with a single Ethernet cable, so everything keeps working when one of the uplinks fails. The switches have no redundant power, however, and that's where the trouble starts.

We're currently looking at buying two new switches, since the current ones are quite old and need to be replaced. We'd like to get to a situation where the two switches are fully redundant, so we can lose one of them without noticing anything, and without redundant power supplies (those cost about as much as a switch, which is roughly 1,300 euros).

So, our idea was to put one 48-port switch into each cabinet and connect all servers to both switches. However, we would like each server to handle its traffic on a single IP address, and that IP should keep working even if one switch drops (so either both interfaces would need to listen on that IP, or there should be some fail-over mechanism). We know ucarp, but that would still result in downtime (though minimal). We've been looking at bonding two interfaces together and are investigating that right now.
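
A minimal sketch of the bond we're testing, in Debian ifupdown syntax (interface names and addresses are placeholders, not our real config):

    # /etc/network/interfaces -- active-backup bond across both switches.
    # eth0 goes to the switch in cabinet A, eth1 to the one in cabinet B;
    # only one slave carries traffic at a time, so the single IP survives
    # the loss of either switch.
    auto bond0
    iface bond0 inet static
        address 192.0.2.10
        netmask 255.255.255.0
        gateway 192.0.2.1
        bond-slaves eth0 eth1
        bond-mode active-backup
        bond-miimon 100
        bond-primary eth0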

My question is, is there any other way to achieve what we want (perhaps at the switch level)? The switch we're looking at is the HP A5120-48G EI.

Comments

  • rds100 Member
    edited January 2013

    Buy a third switch and keep it as a cold spare (configured and ready). Accept that once every 5-6 years one of the switches will fail and there will be some downtime while you turn on the spare switch and rewire everything to it.
    Not many switches support multi-chassis LAG, and the added complexity might result in more downtime overall, not less.

  • @rds100 said: Buy a third switch and keep it as a cold spare (configured and ready). Accept that once every 5-6 years one of the switches will fail and there will be some downtime while you turn on the spare switch and rewire everything to it.

    Not many switches support multi-chassis LAG, and the added complexity might result in more downtime overall, not less.

    That's a plan, but we would still need someone to plug in all the cables. ucarp is faster (I believe we get 10 seconds of downtime at most on our MooseFS cluster).
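
    For reference, roughly the kind of ucarp setup we run -- a sketch only; the addresses, password, and script paths are placeholders:

        # Run the same command on both nodes; CARP elects one master to
        # hold the shared IP (tune --advskew to prefer one node).
        ucarp --interface=eth0 --srcip=10.0.0.2 --vhid=1 --pass=secret \
              --addr=10.0.0.100 \
              --upscript=/usr/local/sbin/vip-up.sh \
              --downscript=/usr/local/sbin/vip-down.sh

        # vip-up.sh:   ip addr add 10.0.0.100/24 dev "$1"
        # vip-down.sh: ip addr del 10.0.0.100/24 dev "$1"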

  • mpi Member

    You can do failover (bonding) in Linux when connecting one node to multiple switches.

  • @mpi said: You can do failover (bonding) in Linux when connecting one node to multiple switches.

    Yep, that's what we're trying. I'm just wondering if there's anything else we could try (perhaps at the switch level).
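
    A quick runtime version of that bond, assuming a reasonably recent kernel and iproute2 (older setups would use ifenslave or sysfs instead); names and addresses are placeholders:

        # Create an active-backup bond on the fly and enslave both NICs
        ip link add bond0 type bond mode active-backup miimon 100
        ip link set eth0 down && ip link set eth0 master bond0
        ip link set eth1 down && ip link set eth1 master bond0
        ip link set bond0 up
        ip addr add 192.0.2.10/24 dev bond0

        # Shows which slave is currently active; pull a cable to watch
        # the failover happen
        cat /proc/net/bonding/bond0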

  • I don't know this specific HP switch (I'm not very familiar with HP), but I suggest buying switches with some kind of hitless failover functionality, such as Brocade switches. Combine that with failover on the servers' connections to the switches, like an active-backup bond or STP.

  • VRRP.
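
    (For example with keepalived -- a minimal sketch; the VIP, interface, and priority are placeholders. The backup node uses state BACKUP and a lower priority; keepalived moves the virtual IP when the master stops advertising.)

        # /etc/keepalived/keepalived.conf on the master node
        vrrp_instance VI_1 {
            state MASTER
            interface bond0
            virtual_router_id 51
            priority 100
            advert_int 1
            virtual_ipaddress {
                10.0.0.100/24
            }
        }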

  • mikho Member, Host Rep

    The only way I can think of to solve it would be if all servers have at least 2 NICs that you can team in pairs.
    Then connect one wire to each switch and enable spanning tree.

  • If you look closely, there are some cheap options with redundant PSUs.

    Do you need Layer 3? That makes it really expensive.
    If not, or only very basic L3, get a Force10 S25 (not the S25P, which has 24x SFP and only 4x Ethernet). These have redundant PSUs (though not hot-swappable) and good hardware (made in Taiwan, not China); FTOS is also very similar to Cisco IOS, except for the much better VLAN handling.

    There are 2 for sale on eBay, lightly used, for 500 US$ each:
    http://www.ebay.com/itm/FORCE10-S25-01-GE-24V-24x-10-100-1000-RJ45-PORTS-4x-SHARED-SFP-L3-SWITCH-USED-/251211177975?pt=US_Network_Switches&hash=item3a7d5a5ff7

    You also need a stacking module; I recommend the 40G one, as it supports the entire backplane, and a 2x 10G XFP module:
    http://www.ebay.com/itm/Force10-S50-01-24G-1S-24G-STACKING-MODULE-S50N-S50V-S25-S25V-QUANTITY-/290832461624?pt=US_Network_Switch_Modules&hash=item43b6f72f38
    170$ each; you might need to source a longer cable (~100 US$).
    (You can add 2 expansion cards to an S25, so you could add a second 40G stacking card to each and stack indefinitely (the backplane becomes shared over 3 switches), or use a 3rd one for redundancy, or connect them with 2 cables for additional capacity/redundancy.)

    A 10G module, if you need one, costs around 400$ (2x XFP per module, XFP optics not included; SX/550m is around 130$ each, LX/10km 500$+ depending on wavelength and range).

    Then configure them like your current ones (you might need some VLAN magic for the failover), and et voilà: everything redundant for not even 2,000$ total.

  • @William said: Do you need Layer 3? That makes it really expensive.

    Funnily enough, the much less capable HP 2810s cost about 50 euros more, and they are plain Layer 2. So this is a fantastic deal for such a switch.

    We're specifically looking for new switches with a warranty. It's not something I get to decide, unfortunately; otherwise I'd have picked up two Force10s already. I've heard a lot of good things about them :) Thanks for the great suggestion anyway.

    Thanks everybody for the suggestions so far. I'll look into all of them :) If anyone has another idea, let me know!

  • One thing to note about failover: the switch has to completely crap out (link down) for the OS to switch from primary to secondary. If the switch merely stops forwarding packets, nothing happens. I somehow managed to make this happen once; it required the DC to pull the power on the failed switch.

    If you require really low downtime, Juniper and Cisco both have virtual chassis switches that support LACP across two members. LACP constantly sends link-state packets to verify the path, so this would lower any potential downtime.
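
    On the server side that would be an 802.3ad (LACP) bond -- a sketch in Debian ifupdown syntax, assuming the two switch ports are configured as one multi-chassis LAG; names and addresses are placeholders:

        # /etc/network/interfaces -- LACP bond: both links carry traffic,
        # and the periodic LACPDUs detect a dead path even when the
        # physical link stays up.
        auto bond0
        iface bond0 inet static
            address 192.0.2.10
            netmask 255.255.255.0
            bond-slaves eth0 eth1
            bond-mode 802.3ad
            bond-miimon 100
            bond-lacp-rate fast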

  • prometeus Member, Host Rep
    edited January 2013

    The easy setup, as suggested, is a stack of two switches; then ask your upstream provider to set up an etherchannel/LACP LAG across the 2 links.

    The alternative would be to set up a proper spanning tree configuration between all the switches involved to prevent loops:

    isp-switch-a <---> YOUR-switch-a <---> YOUR-switch-B <---> isp-switch-b

    You can then connect your servers to both of your switches with active/passive bonding/teaming.
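
    A sketch of that active/passive bond with ARP monitoring instead of MII monitoring -- it also catches the "link up but not forwarding" failure mentioned above; the probe target (typically your gateway) is a placeholder:

        # /etc/modprobe.d/bonding.conf -- probe a next-hop IP through the
        # bond every second; a slave that stops seeing ARP replies is
        # marked down even if its link LED is still green.
        options bonding mode=active-backup arp_interval=1000 arp_ip_target=192.0.2.1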

  • @mpkossen said: We're specifically looking for new switches with a warranty. It's not something I get to decide, unfortunately; otherwise I'd have picked up two Force10s already

    Force10 is available new as well (I paid 1,700 EUR per S25P; a normal one should be much less) and the warranty is handled by Dell...

  • @William said: Force10 is available new as well (I paid 1,700 EUR per S25P; a normal one should be much less) and the warranty is handled by Dell...

    Interesting. I'll give our Dell supplier a call this week to see if they can quote me a price. Thanks!

  • eLohkCalb Member
    edited January 2013

    [Offtopic]

    Just curious, does anyone know if the Force10 S25N supports ingress rate limiting with CIR/CBS?

    [/Offtopic]

  • @eLohkCalb I haven't used these switches, but usually ingress rate limiting is supported even by the almost braindead managed switches. It seems it is egress limiting that is more complex to implement. And the Force10 should not be braindead at all.

  • @eLohkCalb said: Just curious, does anyone know if the Force10 S25N supports ingress rate limiting with CIR/CBS?

    Yes, it does (both ingress and egress); it requires the L3 image.

  • fileMEDIA Member
    edited January 2013

    We're using Nexus 2000 Fabric Extenders (ToR) and a Nexus 5000 with vPC+ here, and it works very well :)

  • @rds100, I have trouble believing those marketing-oriented datasheets, and sometimes the technical documents fail to clarify that point as well.

    My issue with this rate limiting is that I usually need to restrict outgoing traffic from the servers, hence ingress (on the switch port) is more important than egress. Some of these switches implement it with a single rate and color-blind metering, which falls short in practice.

    @William, thanks for the info. :)
