Friday, March 9, 2012

dt-lacp Distributed Trunking

Distributed Trunking -

Now everyone understands how two switches connected logically (stacked) can balance an aggregate link between them with a shared logical backplane, as Cisco does, but what about HP, which doesn't have stacked switches and where only a few higher-end switches have the ability to do distributed trunking? With it, a server (switch-to-switch is not supported) can connect to two separate switches and create a single LACP trunk to both.

*note - the two switches must be directly connected, and while the documentation is a little unclear, it sounds like they need a dedicated link between them as the dt-trunk link, apart from the regular link. (This does not have to be aggregated; it could be a single gig connection, but as you will see in a moment it should be of equivalent bandwidth to each switch's LACP link.)

Now this seems simple, but what happens under the hood? With HP's usual lack of documentation it took me a while to dig up, but basically: create a non-negotiating trunk (e.g. lacp auto will not work) and connect the server. The server sends a frame to switch A, which learns the MAC and forwards it to the destination. Switch B also learns the MAC as being on switch A. When switch B receives a frame, having learned the MAC as being on switch A, it forwards the frame over to switch A to send to the destination.

So in essence you are creating redundancy, but as for doubling bandwidth, ALL traffic will pass through a single switch, so both the inter-switch link and EACH switch's core uplink must be able to support the full server LACP bandwidth, across all of its links.

E.g. the server has 2 gigs to switch A and 2 gigs to switch B. Switch A needs at minimum 2 gigs to switch B. Switch A and switch B both need at minimum 4 gigs up to the uplink/core (assuming you want the server to have 4 gigs of bandwidth to the core). This is of course on top of any other traffic the switches will be forwarding from other devices.

Of course spanning tree is going to have to block a link somewhere anyway, so all traffic forwarding through a single switch may be the expected outcome regardless.

Cheers!

Oh ya, the config:

ProCurve Switch 1(config)# switch-interconnect a1
ProCurve Switch 2(config)# switch-interconnect a1
ProCurve Switch 1(config)# trunk c10-c11 trk99 dt-lacp
ProCurve Switch 2(config)# trunk b20-b21 trk99 dt-lacp

And of course, if you need an aggregated link between the switches, replace a1 with trk5 and add whatever ports you need to trk5, just like a regular trunk.
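Something along these lines ought to do it; the d1-d2 port range and the trk5 number here are just placeholders for whatever you actually have cabled and free between the two switches, and whether you build trk5 as a static trunk or an lacp trunk is up to you, same as any regular trunk:

ProCurve Switch 1(config)# trunk d1-d2 trk5 trunk
ProCurve Switch 1(config)# switch-interconnect trk5
ProCurve Switch 2(config)# trunk d1-d2 trk5 trunk
ProCurve Switch 2(config)# switch-interconnect trk5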

show lacp distributed is your show command to see the LACP status of BOTH switches.
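And if things look off, the ordinary commands are still worth a look on each switch individually; nothing DT-specific about these, they just show each side's own local view:

ProCurve Switch 1# show trunks
ProCurve Switch 1# show lacp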

Comments:

Unknown said...

Makes me wish I kept up with Network Engineering. Oh well, maybe next decade.


Anonymous said...

Distributed Trunking seems like a half-baked idea. What is the point if you can't connect switches split over the two in the DT?

so you can't do this?

Site A Switch 1 (ISC to Switch 2)
Site A Switch 2 (ISC to Switch 1)

Site 2 Switch 1 (2 cables, one to each switch in Site A, using LACP)

no good?

Unknown said...

As said, if it were only for connecting servers to two independent switches while still having aggregated links, it would be half baked, as Stylus pointed out so eloquently. I'm afraid, though, that it is not true that these connections are primarily for attaching servers. Not only would it not make sense from a network perspective, but given the limited number of distributed trunks allowed (if I recall correctly, 64 max) you wouldn't be able to put your entire server environment in this state unless you are running a relatively small environment, and small environments usually don't have budgets for switches with this capability. Having, for instance, an 8212zl switch will give you significantly more ports and possibilities than this limitation allows.

DT is actually meant to provide additional redundancy and bandwidth between switches, much like the way it is done with a Cisco stack. It is more complicated and demanding, especially when you want one DT cluster talking to another DT cluster (since all switches are required to be connected in a full mesh to function correctly), as in the following situation:

.-------.    .-------.
| 8200A |----| 8200B |
'-------'    '-------'
    | \        / |
    |  \      /  |
    |   \    /   |
    |    \  /    |
    |     \/     |
    |     /\     |
    |    /  \    |
    |   /    \   |
    |  /      \  |
    | /        \ |
.-------.    .-------.
| 6600A |----| 6600B |
'-------'    '-------'

Indeed, DT needs a direct link between the 2 devices in a DT cluster, actually 2. The switch model of the second node in your DT cluster can be of a different make as long as the firmware is equal (8200zl, 5400zl and 6600 share the same firmware code). The primary direct connection between the 2 nodes in the cluster is called your interswitch connect (ISC) and is automatically added to each and every VLAN you add to the switch you are configuring. There is no way to eliminate that VLAN from the interface (port or trunk/portchannel) you have configured to act as ISC.

The only VLAN excluded from that port is the one configured as the Peer Keep-Alive link. This link only comes into action should the ISC interface go down through removal of the link or damage to the port/module of that switch. In that case the PKA link will start checking whether the "other side" is still alive itself, and if so, an election will be forced where the switch with the lowest MAC address gets preference inside the cluster (the master switch, if you want to call it that). The "slave" will then block all configured DT-trunks so that its partners know to send all data to the other switch.

Unknown said...

Using DT has some serious drawbacks, among them that DT-trunks should not actively participate in spanning-tree (meaning each interface in a DT-team requires bpdu-filter to be enabled, which is done automatically by later firmware revisions) and some IGMP limitations. Since the number of DT trunks is limited, if you choose to attach your servers to both switches without trunking, you may encounter a condition where your server sends data out to your "slave", meaning it will end up in the big nothing since all DTs have been disabled. Also, a DT cluster does not share the switch configuration, meaning that you will need to duplicate the configuration manually and implement additional functionality like VRRP(E) if one of the nodes is acting as a Layer-3 router. On the positive side, since its ISC and Peer Keep-Alive links work completely on regular Layer 2 Ethernet, your connection distance between the 2 nodes is limited only by the distance you can get your Layer 2 to cover. That means you could stretch your ISC across to e.g. a disaster recovery site or twin dataroom if you really want to, provided your latency is low enough.

Having said that, you do need to ask yourself whether you would seriously want to put in such a mechanism, as I find it relatively sensitive to errors and mistakes, but it certainly helps in increasing bandwidth in your core network. Note though that it behaves like any other LACP trunk/portchannel, with the same balancing algorithm behind it, so a single IP session/stream will never be distributed across multiple links in the DT.

Hope you find the information a bit useful.

P.S. On a side note, nowadays HP does in fact offer switches with stacking capability, but supporting models (e.g. the 2930) usually don't do it out of the box; you are required to add an optional stacking module.
