The Ethernet Switching Landscape – Part 06 – Fabric


This is one of a multi-part series on the Ethernet switching landscape I wrote to support a 2-hour presentation I made at Interop Las Vegas 2014. Part 1 of this written series appeared on NetworkComputing.com. Search for the rest of this series.

Ethernet fabric is one of those terms that has been co-opted by media & marketing folks to some degree. Like “cloud” and “SDN”, “fabric” has become a nebulous concept that doesn’t necessarily mean much of anything depending on who’s doing the talking. To some, fabric means simply an Ethernet network. Does an Ethernet host connect to some other Ethernet host via a bunch of Ethernet equipment in the middle? We have a fabric! For our purposes, such ambiguity is not useful. I think of Ethernet fabric in the context of a piece of cloth – a woven, predictable material. A fabric is the realization of a well-constructed ECMP design: a weaving of interswitch links.

I believe an Ethernet fabric should have the following characteristics:

  • A mesh of links that can be followed between source and destination hosts. Those links might be L2, L3, or a combination of the two.
  • All links are in a forwarding state.
  • A properly designed and deployed fabric topology can be updated (i.e. add or remove switches or links) with negligible disruption to hosts or in-flight traffic.
  • A fabric should offer predictable forwarding behavior, no matter where two endpoints are. The idea is that no matter where a host is plugged into the fabric, that host should be able to reach any other host in the fabric in a predictable way. For example, in scaled-out leaf-spine designs (all leaf switches connected to all spine switches), this means that hosts are only 3 switch hops away from any other host: Host -> Leaf -> Spine -> Leaf -> Host (see the sketch after this list).
  • I don’t know that I’d make this a hard & fast requirement, but I believe a true Ethernet fabric should be well-suited to carry storage traffic. For example, a fabric should support the Data Center Bridging (DCB) standards, which help FCoE co-exist with non-FC payload frames. In the SDN world, DCB doesn’t get the press it did 2 years ago, but it’s still a relevant technology.
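
To make the hop-count point concrete, here is a minimal Python sketch of a leaf-spine fabric. The switch names and counts are hypothetical, and the model ignores link failures and ECMP hashing details; it simply shows that any host-to-host path across the fabric is Host -> Leaf -> Spine -> Leaf -> Host.

```python
# Minimal leaf-spine path sketch (illustrative only; switch names are hypothetical).
# Every leaf connects to every spine, so any path that crosses the fabric is
# Host -> Leaf -> Spine -> Leaf -> Host: at most three switch hops.

from itertools import product

LEAVES = ["leaf1", "leaf2", "leaf3", "leaf4"]
SPINES = ["spine1", "spine2"]

def fabric_path(src_leaf: str, dst_leaf: str, spine: str) -> list[str]:
    """Return the switch hops between two hosts attached to src_leaf and dst_leaf."""
    if src_leaf == dst_leaf:
        return [src_leaf]                  # same leaf: one hop, never touches a spine
    return [src_leaf, spine, dst_leaf]     # different leaves: leaf -> spine -> leaf

# Verify that no path through the fabric ever exceeds three switch hops.
for src, dst, spine in product(LEAVES, LEAVES, SPINES):
    assert len(fabric_path(src, dst, spine)) <= 3

print(fabric_path("leaf1", "leaf3", "spine2"))   # ['leaf1', 'spine2', 'leaf3']
```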

Legacy Ethernet networks based on spanning-tree do not meet the criteria described above. My point is that I do not believe that just any Ethernet network qualifies as a fabric merely because Ethernet is in use. Fabric as a term should have a specific meaning to network designers.

There are a number of technologies that might be found in an Ethernet fabric. At layer 2, TRILL and Shortest Path Bridging (SPB) are useful as a spanning-tree replacement, also providing ECMP. MLAG can also be useful in L2 fabrics, but often does not scale to the needs of larger environments. In layer 3 fabrics, interswitch links form an IP transport between switches, and an overlay such as VXLAN is sometimes used for network virtualization. All of these designs have their merits; there is no one right answer. One of the key elements in the design choice has to do with scale. How many virtual hosts does the fabric need to support? How many physical? Answering these questions will help determine the most appropriate starting point for a fabric design.
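
Those scale questions often reduce to simple arithmetic. Below is a back-of-the-envelope Python sketch using assumed port counts and virtualization density (the numbers are illustrative, not tied to any product) to bound how many physical and virtual hosts a two-tier leaf-spine fabric could support.

```python
# Back-of-the-envelope fabric sizing (hypothetical numbers, not a recommendation).
# Assumes a two-tier leaf-spine design where every leaf uplinks to every spine.

LEAF_HOST_PORTS = 48      # assumed downlink ports per leaf switch
LEAF_UPLINK_PORTS = 6     # assumed uplink ports per leaf (one per spine)
SPINE_PORT_COUNT = 32     # assumed ports per spine switch
VMS_PER_HOST = 20         # assumed virtualization density

max_leaves = SPINE_PORT_COUNT      # each spine port terminates one leaf
max_spines = LEAF_UPLINK_PORTS     # each leaf uplink terminates on one spine
physical_hosts = max_leaves * LEAF_HOST_PORTS
virtual_hosts = physical_hosts * VMS_PER_HOST
oversubscription = LEAF_HOST_PORTS / LEAF_UPLINK_PORTS  # assuming equal port speeds

print(f"Spines: {max_spines}, Leaves: {max_leaves}")
print(f"Physical hosts: {physical_hosts}, Virtual hosts: {virtual_hosts}")
print(f"Leaf oversubscription ratio: {oversubscription:.1f}:1")
```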

TRILL is a protocol providing a routing mechanism for layer 2 frames, using the notion of an RBridge (routing bridge). TRILL was conceived by Radia Perlman herself as a replacement for spanning-tree. RBridges determine the shortest path to deliver a given frame via the flexible IS-IS routing protocol. Ethernet frames are routed across the fabric; original Ethernet frames are encapsulated in a TRILL frame for the trip between RBridges. Cisco FabricPath and Brocade VCS are both based on TRILL, although both are proprietary flavors and are mutually incompatible. FabricPath can be run in TRILL standard mode, although some functionality is lost. Huawei and HP also make TRILL available in some of their switch lines, although I am less familiar with the specifics of their implementations. I understand Huawei’s implementation of TRILL to be purely standards-based. Network operators interested in TRILL should note that the encapsulation implies that TRILL is more than just a simple software upgrade. Switch silicon must support TRILL encapsulation and decapsulation in order for TRILL to operate at scale.
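
To illustrate what “encapsulated in a TRILL frame” means in practice, here is a rough Python sketch that packs the RFC 6325 TRILL header fields in front of an inner frame. The RBridge nicknames are invented and the outer Ethernet header is omitted for brevity; this is an illustration of the framing, not a usable implementation.

```python
# Sketch of how TRILL wraps an original Ethernet frame (field layout per RFC 6325).
# Nicknames below are made up for illustration.

import struct

TRILL_ETHERTYPE = 0x22F3

def trill_encapsulate(inner_frame: bytes,
                      ingress_nickname: int,
                      egress_nickname: int,
                      hop_count: int = 0x3F) -> bytes:
    """Prepend the TRILL ethertype and header to an already-built inner frame.

    The outer Ethernet header (next-hop RBridge MACs) is omitted here; it is
    rewritten at every RBridge hop, much like a routed IP packet.
    """
    # First 16 bits after the ethertype: V(2) | R(2) | M(1) | Op-Length(5) | Hop Count(6).
    first_word = hop_count & 0x3F
    trill_header = struct.pack("!HHHH", TRILL_ETHERTYPE, first_word,
                               egress_nickname, ingress_nickname)
    return trill_header + inner_frame

original = bytes(64)   # stand-in for the host's original Ethernet frame
wrapped = trill_encapsulate(original, ingress_nickname=0x0001, egress_nickname=0x0002)
print(len(wrapped) - len(original), "bytes of TRILL overhead (before outer Ethernet)")
```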

Shortest Path Bridging (SPB) is conceptually similar to TRILL. Like TRILL, SPB uses IS-IS as the routing protocol for frames. Also like TRILL, SPB encapsulates entire Ethernet frames between SPB switches. SPB was also conceived as a spanning-tree replacement, although the roots of SPB are found in the service provider world, not the enterprise. The similarities between TRILL & SPB end more or less there. SPB and TRILL are neither compatible nor complementary. Rather, they appear to be competing standards. SPB is touted by advocates as “modern Ethernet,” with those same advocates calling ordinary Ethernet “classic Ethernet” in SPB literature. Befitting its service provider roots, SPB is rich in functionality, bringing with it multitenancy, multicast capabilities, and simplicity of operation. SPB vendors have also emphasized interoperability testing: Huawei, Enterasys (now Extreme), and Avaya, among others, have all completed SPB interoperability tests with one another, again reflecting the service provider history of SPB. Enterprise network engineers interested in SPB would do well to begin their investigation with Avaya. Avaya’s VENA fabric is an SPB implementation, and Avaya makes a compelling case for VENA in the enterprise and data center. Although SPB also encapsulates Ethernet frames, silicon appears to be less of a concern. The most common encapsulation type used in SPB is a flavor of MAC-in-MAC, something ASICs manufactured in recent years are likely to support.
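
For comparison with the TRILL sketch above, the Python sketch below packs a MAC-in-MAC (802.1ah) style wrapper of the kind SPBM relies on. The backbone MACs, B-VID, and I-SID are invented values; the point is simply that the added header is made of plain Ethernet-style fields that recent ASICs are likely to handle already.

```python
# Sketch of the 802.1ah (MAC-in-MAC) wrapper SPBM typically uses.
# Backbone MACs, B-VID, and I-SID values are invented for illustration.

import struct

BTAG_ETHERTYPE = 0x88A8   # backbone VLAN tag ethertype
ITAG_ETHERTYPE = 0x88E7   # backbone service instance tag ethertype

def mac_in_mac(inner_frame: bytes, b_da: bytes, b_sa: bytes,
               b_vid: int, i_sid: int) -> bytes:
    """Wrap an original Ethernet frame in a provider backbone (802.1ah) header."""
    b_tag = struct.pack("!HH", BTAG_ETHERTYPE, b_vid & 0x0FFF)
    # I-TAG: ethertype, a flags/priority byte (zeroed here), then the 24-bit I-SID.
    i_tag = struct.pack("!HB", ITAG_ETHERTYPE, 0) + (i_sid & 0xFFFFFF).to_bytes(3, "big")
    return b_da + b_sa + b_tag + i_tag + inner_frame

inner = bytes(64)   # stand-in for the original customer frame
outer = mac_in_mac(inner,
                   b_da=bytes.fromhex("00bb00bb00bb"),
                   b_sa=bytes.fromhex("00aa00aa00aa"),
                   b_vid=10, i_sid=20001)
print(len(outer) - len(inner), "bytes of MAC-in-MAC overhead")
```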

Worth mentioning is that some MLAG schemes might do a comparable job to TRILL or SPB in certain scenarios. Don’t dismiss MLAG technology out of hand simply because it’s not as trendy as L2MP schemes. Even if TRILL or SPB ends up being a data center’s ultimate fabric design choice, that doesn’t rule out using MLAG in certain places, say at the access layer to multi-home hosts.

I don’t feel a huge compulsion to discuss layer 3 fabrics here, as the rudiments of routing protocols are well-known and widely written about. I have heard it argued that layer 2 designs, whether based on TRILL, SPB, or neither, have the same fundamental scaling limitations L2 networks have always had. For massive scale, then, an L3 fabric is implemented. Networkers know how to build large L3 networks.

My reading suggests that the very largest organizations look toward L3 fabric designs, layering overlays such as VXLAN on top to achieve maximum scale & network virtualization. That said, VXLAN is not without its own scalability concerns. Designers looking to implement VXLAN in a large environment need to look for switches that terminate VXLAN in hardware (hardware VTEPs), as well as evaluate their VXLAN vendor’s particular control-plane implementation. Some VXLAN implementations rely on multicast for flooding, but more recent implementations do away with this design, relying instead on knowledge of every endpoint to deliver all frames via unicast.
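
As a final illustration, the sketch below builds the 8-byte VXLAN header defined in RFC 7348, the piece a VTEP prepends before the frame is carried in UDP (destination port 4789) across the L3 fabric. The VNI value is arbitrary, and the outer Ethernet/IP/UDP headers are left out.

```python
# Sketch of the VXLAN header (RFC 7348) a VTEP prepends before handing the
# original frame to UDP (destination port 4789). VNI value is invented.

import struct

VXLAN_UDP_PORT = 4789
VXLAN_FLAG_VNI_VALID = 0x08

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags(8) | reserved(24) | VNI(24) | reserved(8)."""
    return struct.pack("!BBBB", VXLAN_FLAG_VNI_VALID, 0, 0, 0) + \
           (vni & 0xFFFFFF).to_bytes(3, "big") + b"\x00"

inner_frame = bytes(64)   # stand-in for the original Ethernet frame
payload = vxlan_header(vni=10100) + inner_frame
print(f"UDP dst port {VXLAN_UDP_PORT}, "
      f"VXLAN header overhead {len(payload) - len(inner_frame)} bytes")
```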