
The Ethernet Switching Landscape – Part 03 – Different speeds for different needs.

1,800 words. Plan about 12 minutes to read this.

This is one article in a multi-part series on the Ethernet switching landscape I’m writing in preparation for a 2-hour presentation at Interop Las Vegas 2014 (hour 1 & hour 2). Part 1 of this written series appeared on NetworkComputing.com. Search for the rest of this series.

When considering an Ethernet switch, the sort of network you’re trying to build factors into the decision. For example…

  • A campus network with a high-speed core that fans out into closets at each building will do fine with a great deal of oversubscription at the access layer; 100:1 is not unheard of.
  • A small data center supporting enterprise applications will likely do just fine with 3:1 or more access-layer oversubscription, assuming a traditional mix of sporadically loaded SQL, HTTP, mail, and file/print services. Tolerance for oversubscription in this environment might be tempered by the volume of IP storage on the wire.
  • A large data center supporting pods with a high density of virtual machines & converged storage traffic will be more sensitive to access-layer oversubscription, and may be less tolerant of dropping packets. A fat uplink to the aggregation tier coupled with a buffer-rich access layer could be in order.
  • Fully and partially meshed L2 topologies demand switches with MLAG (in some flavor), TRILL, or SPB to facilitate forwarding on all links. Not all switches have these capabilities.

I raise these different scenarios not to suggest a design clinic, or even to harp on oversubscription concerns between network tiers. Rather, I’m suggesting that understanding what your business requires from applications, coupled with your physical environment, will in part drive what you require from an Ethernet switch. Campus access switches don’t need 10GbE facing the users, but might benefit from 10GbE back to the aggregation or core tier. Dense data center pods facing hypervisor hosts and IP storage arrays with multiple 10GbE connections might benefit from 40GbE uplinks back to the aggregation layer. Data centers running an even split of 1GbE physical servers and aggregated virtual servers might need different switches in different parts of the data center.

The Ethernet switch story is not a simple tale of “faster is better.” Rather, the story is one of “different speeds for different needs.”

Use-cases & considerations driving various Ethernet speeds.

The most common Ethernet switches available today contain predominantly 1GbE access ports. If you have servers or workstations with 1GbE access requirements (don’t we all?), then you’ll be shopping for a switch in this class. Here are a few buying considerations.

  • Do you need ports that support 1000Mbps, 100/1000Mbps, or 10/100/1000Mbps? While the vast majority of 1GbE switches will offer auto-sensing 10/100/1000Mbps ports, this is not a given. Read the fine print. This is especially key for SFP modules.
  • 1GbE switches are sometimes distinguished by their uplink options. For example, some Cisco 3750X switches offer an uplink module with either 4x1GbE or 2x10GbE ports. You need to understand both how much oversubscription is acceptable, and the capabilities of the uplink module.

    By oversubscription, I’m referring in this case to the amount of bandwidth facing hosts versus the aggregation or core layers. For example, a switch with 48 x 1GbE host-facing ports and 2x10GbE uplink ports has an oversubscription ratio of 48:20, or 2.4:1. If the ratio were 1:1, there would be no oversubscription (there’s a worked sketch of this calculation after this list). Oversubscription of this and related types comes into play in many ways in Ethernet switching and data center design. Other examples include the forwarding capacity per slot in a chassis versus the host-facing capacity of the line card, and the internal forwarding capacity of a fixed-configuration switch versus the amount of bandwidth represented by the host-facing ports.

  • 1GbE switches may be non-blocking. The notion of non-blocking is that all ports in the switch can forward at line rate in both directions, all at the same time. In theory, a non-blocking switch will never drop a frame, but the reality of this goes back to architecture and uplink oversubscription. How many user-facing ports are you trying to fit into how many core-facing ports? Just because a switch is non-blocking doesn’t mean you can fit 48x1GbE data streams into 2x10GbE uplinks.
  • It’s worth noting that Cisco Nexus 2000-series Fabric Extenders (FEXs) are not stand-alone switches. They require a Nexus switch with FEX management capabilities. I raise this issue because FEXs look like switches, are offered at an attractive price point, and models such as the 2248T offer 48x1GbE ports with 4x10GbE uplinks – an attractively low oversubscription ratio.
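
To make the arithmetic concrete, here is a minimal Python sketch of the oversubscription calculation described above. The port counts and speeds are illustrative figures only, not tied to any particular product.

```python
# Minimal sketch: access-layer oversubscription ratio.
# Host-facing bandwidth divided by uplink bandwidth; anything over 1 is oversubscribed.

def oversubscription_ratio(host_ports, host_speed_gbps, uplink_ports, uplink_speed_gbps):
    """Return the ratio of host-facing bandwidth to uplink bandwidth."""
    return (host_ports * host_speed_gbps) / (uplink_ports * uplink_speed_gbps)

# 48 x 1GbE host ports into 2 x 10GbE uplinks -> 48:20, or 2.4:1
print(oversubscription_ratio(48, 1, 2, 10))   # 2.4

# The same host ports into a 4 x 10GbE uplink module -> 48:40, or 1.2:1
print(oversubscription_ratio(48, 1, 4, 10))   # 1.2
```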

10GbE has come into wide availability in several forms. It’s true that 10GbE ports can often be found as uplink ports on 1GbE switches, but 10GbE access switches have become commonplace in data centers as well.

  • If you’re shopping for 10GbE access switches, you’re very likely in a data center environment, servicing blade centers and possibly storage arrays. Again, the challenge is one of access-layer uplink oversubscription. While 40GbE uplinks are becoming more common, 10GbE uplink ports are found just as frequently. As the market for 10GbE access switches is changing, stay on top of new product offerings. For example, Cisco has been selling the venerable Nexus 5548UP & 5596UP switches for some time, and recently made 4x40GbE uplink modules available to enhance the offering. However, Cisco has also introduced the new Nexus 5672UP with up to 48x10GbE + 6x40GbE, as well as the Nexus 56128P with up to 96x10GbE + 8x40GbE. My point isn’t to advocate for these switches in particular (however fine they might be), but rather to raise awareness that new Ethernet switches come to market seemingly every other month from a variety of vendors. It pays to shop around when getting into 10GbE access layer switching.
  • For those folks earlier in their 10GbE adoption cycle, the use-case for 10GbE might be in the aggregation or core layers, where access switches are uplinking at 10GbE. Again, oversubscription is a potential design challenge, depending on your network’s traffic patterns. The concern is whether the switch(es) all of your access layer switches uplink to have sufficient switching capacity to avoid becoming a bottleneck (a back-of-the-napkin capacity check follows this list). Finding a non-blocking (aka wire-rate) switch to play this role could be important. Know your traffic patterns.
  • For those trying to squeeze the last bit of RoI out of an aging chassis such as the Cisco Catalyst 6500, understand that older switches tend to have limited backplanes. Yes, you can purchase a 16-port 10GbE line card for a 6500. Should you? Well, that card is heavily oversubscribed and not really a long-term growth solution; you get port density, but not bandwidth. It’s sort of like putting a supercharged V8 into a rusted-out Pinto. Sure, you can do it…but the chassis won’t be able to keep up with the engine. For any sort of 10GbE density, you likely need to move into a newer product if what you’ve got is getting on in years.
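
As a back-of-the-napkin check for the aggregation-tier concern above, here’s a small Python sketch comparing worst-case access-layer uplink load against an aggregation switch’s stated forwarding capacity. The switch counts and the 480 Gbps capacity figure are hypothetical, purely for illustration.

```python
# Sketch: could the aggregation switch become a bottleneck for its access uplinks?
# Compares worst-case offered load against the stated fabric forwarding capacity.

def uplink_load_vs_capacity(access_switches, uplinks_per_switch, uplink_speed_gbps,
                            fabric_capacity_gbps):
    """Return (offered load in Gbps, True if the fabric has headroom for it)."""
    offered = access_switches * uplinks_per_switch * uplink_speed_gbps
    return offered, offered <= fabric_capacity_gbps

# 20 access switches, each with 2 x 10GbE uplinks, into a hypothetical 480 Gbps fabric
offered, fits = uplink_load_vs_capacity(20, 2, 10, 480)
print(f"{offered} Gbps of uplinks into a 480 Gbps fabric -> headroom: {fits}")
```

Real traffic rarely hits worst case on every uplink at once, which is why knowing your traffic patterns matters as much as the datasheet numbers.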

40GbE aggregation switches are becoming more common. I don’t feel there’s much to say here, in a certain sense. If you need 40GbE to get from your access tier to your aggregation tier, you know who you are. You’re moving a very large amount of data around your data center. Without 40GbE links, you’d struggle to move traffic between your data center pods at certain times, unless you went well out of your way to match 10GbE uplinks and downlinks carefully and were quite good at load-balancing your traffic across physical links.

  • When shopping for 40GbE port density, realize that the common QSFP+ SR4 optic uses parallel fiber – typically a twelve-fiber MPO trunk, with eight strands carrying the four 10Gbps lanes in each direction. If you’re going switch to switch via a pre-terminated jumper, this might not seem like a big deal. But if you’re going between data center pods, the two fibers that used to give you 10GbE won’t give you 40GbE, at least not with parallel QSFP+ optics. The obvious potential impact here is to the data center fiber cabling plant (a quick fiber-count sketch follows this list).
  • QSFP+ is the most common 40GbE interface type, and many vendors add flexibility by offering an interesting breakout cable, where a single 40GbE interface becomes four distinct 10GbE interfaces. This is useful as a future-proofing strategy, allowing for a 40GbE purchase while still supporting a preponderance of 10GbE interfaces in the near term.
  • Notably, I have seen cabling adapters where a 40GbE interface is converted into a single 1GbE interface. I mention this somewhat for comedy value as the notion that something like that even exists makes my head hurt, but the vendor with this offering explained the logic thusly. Customers who are using an exclusively 40GbE switch in a ToR or EoR application might have a few legacy devices – say an out of band manager or PDU – that require 1GbE. Vendors are all too happy to get customers what they want, assuming they’ll buy in sufficient bulk.

    In a fit of absolute silliness, I wonder if I could get any venture capitalists to fund a 1GbE to 40GbE adapter. For the customer who must preserve their legacy 1GbE switching investment, while still moving into the 40GbE age. I’d sell them for half the cost-per-port of those big name 40GbE switches. And besides, who wants to forklift upgrade a switch? With the 1-to-40 adapter, you don’t have to unrack a thing. You know…I should shut up now. Someone might steal this brilliant idea. ;-)

  • With the announcement of the Nexus 9000 platform, Cisco also announced a “BiDi” optic that uses multiple wavelengths to achieve 40GbE over just two fibers instead of the parallel fiber that QSFP+ SR4 optics require. This means that data centers can swap out 10GbE for 40GbE one-for-one over the existing duplex cabling plant. 40GbE BiDi optics are expected to be released by other vendors as well, as I understand it. If this optic interests you, you can read more about it here.
  • There are several 40GbE switch offerings available. Customers should expect these switches to be non-blocking, as their role is usually as a spine switch providing connectivity between hosts attached to leaf switches. Consider also port density, power consumption, and latency, as these characteristics can vary significantly between products.
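
To make the cabling-plant impact a bit more concrete, here’s a short Python sketch that totals fiber strands for a handful of links under different optic choices. The per-link fiber counts reflect the optics discussed above (duplex 10GbE, parallel QSFP+ SR4 over a 12-fiber MPO trunk, and 40GbE BiDi); the 16-link tally is just a made-up example.

```python
# Sketch: fiber strands to provision per link for a few common optic choices.
FIBERS_PER_LINK = {
    "10GbE SR/LR (duplex LC)": 2,
    "40GbE SR4 (QSFP+ parallel, 12-fiber MPO trunk)": 12,  # 8 strands lit, 12 in the trunk
    "40GbE BiDi (duplex LC)": 2,
}

def plant_fiber_count(links):
    """links: {optic description: number of links} -> total strands to provision."""
    return sum(FIBERS_PER_LINK[optic] * count for optic, count in links.items())

# Hypothetical pod-to-pod build: 16 links, costed under each optic strategy
for optic in FIBERS_PER_LINK:
    print(f"{optic}: {plant_fiber_count({optic: 16})} fibers for 16 links")
```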

100GbE is a thing, but for purposes of this already-too-long article, I’ll not dwell on it here. The vast majority of enterprises and data centers do not need, nor are they looking for, 100GbE. That said, there is a notion of building a 100GbE data center fabric with the idea of not having to worry about bandwidth for several years. Arista has made this argument, based around their 7500E line cards – build it once and forget about bandwidth constraints. A friend of mine has made the point that there’s no QoS like more bandwidth, so assuming the price is right, there is certainly some merit to this point of view. If you find yourself in the 100GbE market, know that you’re in rarefied air, and shop carefully.