The Ethernet Switching Landscape – Part 05 – Equal Cost Multipath (ECMP)

This is one of a multi-part series on the Ethernet switching landscape I wrote to support a 2-hour presentation I made at Interop Las Vegas 2014. Part 1 of this written series appeared on NetworkComputing.com. Search for the rest of this series.

In data center design, the ability to use all available links to forward traffic is an important consideration. Not only is using all link capacity wise from a network design standpoint, but it’s also wise from a financial standpoint. 10GbE and 40GbE links are expensive. The ports take up real estate in a switch. The cabling takes up space in a cabinet or uses costly fiber runs between cabinets. A switch with non-blocking forwarding capacity is being wasted if its lit links are not all being utilized.

MLAG is one way in which spanning tree limitations are overcome, allowing traffic to safely forward on all links in a multi-chassis bundle, even though the physical topology forms a loop. Another way to achieve forwarding on all links is by leveraging equal cost multipath, or ECMP.

ECMP comes in both L2 and L3 flavors. For purposes of this discussion, MLAG does not qualify as L2 ECMP. Why? Because any aggregated link bundle presents itself as a single link to the underlying forwarding mechanisms in the switch. Even though there are multiple parallel links that can forward at L2 in an MLAG bundle, they do not function independently of one another. That said, multiple LAG or MLAG bundles in parallel do qualify as L2 ECMP.

Before we get into the specifics, let’s define what ECMP is. As the name implies, ECMP means that two or more paths with identical properties are available for a switch to forward across to deliver traffic to a particular destination. The simplest example of this is two 10GbE links connected in parallel between two switches. Both links are the same speed. Both links are the same number of hops away. Together, they represent multiple paths of equal cost. Each link is identically capable of delivering the datagram to the other switch.
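
To make the “equal cost” idea concrete, here’s a minimal Python sketch of the per-flow hashing a switch might use to pin each flow to one member of an equal-cost set. This is an illustration only, not any vendor’s actual implementation; the interface names, the field choices, and the use of MD5 are assumptions for the example.

```python
# Illustrative only: hash a flow's header fields, then take the result modulo
# the number of equal-cost links, so every packet of a given flow uses the
# same link while different flows spread across both.
import hashlib

def pick_ecmp_link(src_ip, dst_ip, src_port, dst_port, proto, links):
    """Return one link from an equal-cost set, consistently per flow."""
    flow_key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.md5(flow_key).digest()
    return links[int.from_bytes(digest[:4], "big") % len(links)]

# Two parallel 10GbE links between the same pair of switches (hypothetical names).
parallel_links = ["Ethernet1/1", "Ethernet1/2"]
print(pick_ecmp_link("10.0.0.5", "10.0.1.9", 49152, 443, "tcp", parallel_links))
```

Hashing on the flow rather than spraying packets round-robin keeps the packets of a single conversation in order, which is why switches generally balance per flow rather than per packet.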

The ECMP question becomes one of protocol. By what means is the switch able to see these parallel links as ECMP links? There are several ways.

In the world of L2, there are two significant ECMP strategies: TRILL & Shortest Path Bridging (SPB). Both TRILL and SPB provide a means to route frames across an Ethernet fabric. I’ll discuss some of the details of TRILL & SPB in a future entry in this series. For now, recognize that since nearly all TRILL & SPB implementations base their forwarding schemes on the routing protocol IS-IS, equal cost paths are a possible outcome. The happy result is that frames can be forwarded across several links in parallel. Thus, L2MP (layer 2 multipathing). Take a look at the diagram below.

[Diagram: equal cost multipath forwarding across a small 2-tier leaf/spine topology]
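
To show where those equal cost paths come from, here’s a toy link-state calculation in Python. It’s a sketch in the spirit of what an IS-IS based fabric computes, not a real IS-IS implementation; the node names and link costs are invented for the example.

```python
# Toy link-state computation: find every equal-cost shortest path between two
# fabric switches. Parallel equal-cost paths are what let an L2MP fabric
# spread frames across several links at once.
import heapq
from collections import defaultdict

def all_shortest_paths(topology, src, dst):
    """topology: {node: {neighbor: cost}}. Returns (cost, [paths])."""
    best = defaultdict(lambda: float("inf"))
    paths = defaultdict(list)
    best[src] = 0
    paths[src] = [[src]]
    heap = [(0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best[node]:
            continue
        for neighbor, link_cost in topology[node].items():
            new_cost = cost + link_cost
            if new_cost < best[neighbor]:
                best[neighbor] = new_cost
                paths[neighbor] = [p + [neighbor] for p in paths[node]]
                heapq.heappush(heap, (new_cost, neighbor))
            elif new_cost == best[neighbor]:
                paths[neighbor] += [p + [neighbor] for p in paths[node]]
    return best[dst], paths[dst]

# Two leaves connected through two spines with identical link costs.
fabric = {
    "leaf1":  {"spine1": 10, "spine2": 10},
    "spine1": {"leaf1": 10, "leaf2": 10},
    "spine2": {"leaf1": 10, "leaf2": 10},
    "leaf2":  {"spine1": 10, "spine2": 10},
}
print(all_shortest_paths(fabric, "leaf1", "leaf2"))
# -> (20, [['leaf1', 'spine1', 'leaf2'], ['leaf1', 'spine2', 'leaf2']])
```

With identical link costs, both spines land on the shortest-path list, and the fabric is free to spread frames across both of them.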

In the world of L3, routing protocols are again used to provide ECMP services, OSPF being a common choice inside the data center. L3MP is not especially new; equal cost paths that can all be utilized simultaneously have been a feature of certain routing protocols and platforms for a number of years now. In the context of Ethernet switching and data center design, the interesting part of L3MP is just how many paths are supported by a given platform. For example, Arista Networks has made a big deal of its L3 ECMP capabilities, touting 64-way ECMP in the 7500/7500E platform. This allows a designer to take a leaf/spine design (a small, 2-tier version of which is shown above) and scale it quite widely, while still using all of the interswitch links for forwarding.
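
As a rough illustration of why such wide ECMP still balances traffic, the sketch below hashes a large number of synthetic flows across 64 next hops and checks how evenly they land. The 64-way figure just mirrors the Arista example above; the flows, the CRC32 hash, and the spine names are all made up for the demonstration.

```python
# Hash many synthetic flows across a 64-member next-hop group and report the
# least and most loaded members; the counts stay close to the 1/64 average.
import random
import zlib
from collections import Counter

random.seed(1)
next_hops = [f"spine{i}" for i in range(64)]
load = Counter({hop: 0 for hop in next_hops})

for _ in range(100_000):
    flow = f"10.0.{random.randint(0, 255)}.{random.randint(1, 254)}:{random.randint(1024, 65535)}"
    load[next_hops[zlib.crc32(flow.encode()) % len(next_hops)]] += 1

print(min(load.values()), max(load.values()))  # both near 100000 / 64 ≈ 1562
```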

Worth mentioning is that software defined networking (SDN) can also accomplish ECMP forwarding. OpenFlow is one southbound protocol that could be used for this. The idea is that rather than a distributed routing protocol like IS-IS or OSPF discovering neighbors, learning the topology, determining link costs, and selecting forwarding paths, a central controller discovers the topology, computes the best paths through it, and installs the appropriate entries into switch forwarding tables directly.
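
The sketch below is a schematic of that controller model in Python, with plain dictionaries standing in for real switches and no actual OpenFlow library involved; the switch names, port names, and match fields are invented for the example.

```python
# Schematic of the central-controller model: the controller already knows the
# topology, computes a path end to end, and writes a forwarding entry into
# each switch along the way instead of waiting for IS-IS/OSPF to converge.
def install_path(path, match, flow_tables):
    """path: ordered list of (switch, egress_port); match: flow description."""
    for switch, egress_port in path:
        flow_tables.setdefault(switch, []).append({
            "match": match,
            "action": f"output:{egress_port}",
        })

flow_tables = {}  # stands in for the real switches' forwarding tables
install_path(
    path=[("leaf1", "uplink1"), ("spine1", "port12"), ("leaf2", "server3")],
    match={"dst_ip": "10.0.2.30/32"},
    flow_tables=flow_tables,
)
print(flow_tables["spine1"])
```

The contrast with the routing protocol model is that the path decision is made once, centrally, and each switch simply carries out the forwarding entries it was handed.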