The Ethernet Switching Landscape Part 02 – Finding meaning in latency measurements.

This is one of a multi-part series on the Ethernet switching landscape I’m writing in preparation for a 2-hour presentation at Interop Las Vegas 2014 (hour 1 & hour 2). Part 1 of this written series appeared on NetworkComputing.com. Search for the rest of this series.

One often-quoted statistic in Ethernet switch specifications is latency, with figures usually cited in microseconds or nanoseconds. As latency numbers keep dropping with each new switch platform, what is a networker to make of latency? Is it a key measurement to focus on when evaluating Ethernet switches? The answer comes in two parts. First, there needs to be a good understanding of what latency is. Second, there needs to be a business use case where latency matters. For many consumers, once you understand what Ethernet switch latency is, you might decide it’s not a big factor in your buying decision.

What is latency?

Generally speaking, latency measures the time it takes for a frame to enter and then exit a switch. In other words, latency is a measure of how long it takes for the switch to do its job – that of “switching” a frame in one port and out another. Latency is sometimes described as “port-to-port” latency. Think of it as the amount of time a frame spends inside the switch. That’s a rough description (and not completely accurate), so let’s look a little closer.

According to Gary Lee of Fulcrum Microsystems, latency can be measured in a number of different ways.

“There are several ways to measure latency through a switch: first-bit-in to last-bit-out (FILO), last-bit-in to first-bit-out (LIFO), first-bit-in to first-bit-out (FIFO) and last-bit-in to last-bit-out (LILO). In each case, latency is measured at the switch ingress and egress ports.”

I raise this point about the different measurement techniques only because, as you Google around looking for information on switch latency, you’ll run into this data. The variety of measurement methods might seem like a concern: how do you know that the vendors you are comparing are using the same method when citing their latency specifications? The reality is that practically all modern switch architectures are cut-through (at least when it comes to low-latency forwarding), meaning that the switch begins forwarding a frame before the entirety of the frame has been received. For low-latency operation, cut-through switching is expected. The alternative to cut-through is store-and-forward. In this mode, a frame must be received in its entirety before being forwarded, which notably increases the time it takes for a frame to be switched and makes that time vary with frame size. The point is this: assuming cut-through forwarding, the only accurate way to measure latency is either LILO or FIFO. To quote Gary again,

“These methods [FIFO/LILO] are effectively the same and are the only way to properly measure the latency through a cut-through switch.”
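To make the four definitions concrete, here’s a minimal back-of-the-envelope sketch in Python. The 1,500-byte frame, 10 Gb/s line rate, and 300 ns internal forwarding delay are made-up numbers for illustration, not figures for any particular switch.

    # Illustrative only: the four latency definitions applied to an
    # idealized cut-through switch with equal ingress/egress line rates
    # and a fixed internal forwarding delay (both assumptions).
    FRAME_BITS = 1500 * 8     # a 1,500-byte frame
    LINE_RATE = 10e9          # 10 Gb/s on both ports
    FORWARD_DELAY = 300e-9    # assumed internal delay: 300 ns

    serialization = FRAME_BITS / LINE_RATE  # time to clock the frame on/off the wire

    first_bit_in = 0.0
    last_bit_in = first_bit_in + serialization
    first_bit_out = first_bit_in + FORWARD_DELAY
    last_bit_out = first_bit_out + serialization

    print(f"FILO: {(last_bit_out - first_bit_in) * 1e9:8.1f} ns")   # inflated by serialization time
    print(f"LIFO: {(first_bit_out - last_bit_in) * 1e9:8.1f} ns")   # can even go negative in cut-through
    print(f"FIFO: {(first_bit_out - first_bit_in) * 1e9:8.1f} ns")  # equals the forwarding delay
    print(f"LILO: {(last_bit_out - last_bit_in) * 1e9:8.1f} ns")    # same as FIFO

Run it, and FIFO and LILO both come out to exactly the 300 ns forwarding delay regardless of frame size, while FILO is inflated by the frame’s serialization time and LIFO is reduced by it (to the point of going negative). That’s why FIFO/LILO is the fair yardstick for a cut-through switch.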

Implicitly, then, I believe it’s safe to assume that latency numbers for cut-through switching operation are directly comparable from vendor to vendor. If a vendor reports a latency measurement, it should reflect a FIFO/LILO measurement of the switch in cut-through mode.

How important is latency?

Let’s take a look at latency measurements as reported by the vendors for a few Ethernet switches. Note that this is not an “apples-to-apples” comparison: these switches have different ASICs, may have different host-facing physical ports, and likely first came to market in different years. I have a point, though – bear with me.

These switches all have widely varying latency characteristics, which serve as market differentiators for them. But I believe that latency is only a differentiator to certain market segments. Unless you are building a network where nanoseconds count – such as high-frequency trading – the port-to-port latency of one Ethernet switch over another just isn’t going to make any appreciable difference in the performance of the applications running across your network.

Another consideration is that the Ethernet PHY (the physical Ethernet medium you use, as in the transceiver) also impacts latency. For example, 10GBase-T PHYs (10GbE over twisted-pair copper) introduce noticeably higher latency than SFP+-based PHYs. Paul Kish on the Belden Right Signals blog opines,

“With simplified electronics, SFP+ also offers better latency—typically about 0.3 microseconds per link. 10GBASE-T latency is about 2.6 microseconds per link due to more complex encoding schemes within the equipment.”

But again, the question comes back to one of business needs and application. 10GBASE-T has a use case in top-of-rack (ToR) and end-of-row (EoR) designs, considering that 10GBASE-T server LAN-on-motherboard modules (LOMs) are coming to market. Should the higher latency introduced by the 10GBASE-T PHY put you off? Only if microseconds matter – the quick sketch below puts the per-link numbers in perspective.
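As a rough illustration, here’s how those per-link figures accumulate across a path. The per-link latencies come from the Belden figures quoted above; the three-link path and the 1 ms application round trip are assumptions chosen purely for illustration.

    # Rough arithmetic: accumulated PHY latency over a short path,
    # using the per-link figures quoted above. The hop count and the
    # application RTT budget are assumed values for illustration.
    LINKS = 3                # e.g., server -> ToR -> aggregation -> server
    SFP_PLUS_US = 0.3        # per-link latency, SFP+ (microseconds)
    TENGBASET_US = 2.6       # per-link latency, 10GBASE-T (microseconds)
    APP_RTT_US = 1000.0      # assumed 1 ms application round trip

    penalty = (TENGBASET_US - SFP_PLUS_US) * LINKS
    print(f"10GBASE-T penalty over {LINKS} links: {penalty:.1f} us")
    print(f"Share of a 1 ms application RTT: {penalty / APP_RTT_US:.1%}")

Under those assumptions, the copper path costs roughly 7 additional microseconds – well under one percent of a 1 ms transaction. For most applications, that’s noise.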

Conclusion

Latency, low latency, ultra-low latency and the rest are relevant terms that indicate how long it takes a switch to move traffic through itself. However, beyond the physical medium and switch architecture, many other things can impact application latency. At the end of the day, it might be fun to argue about microseconds and nanoseconds in the context of your switches, but what difference will it make if you have bottlenecks in other places? Consider these elements of your IT engine, which are far more likely to negatively impact your application performance than Ethernet switching latency:

  • Underperforming storage reads & writes.
  • A CPU or memory bound host.
  • A congested network path.
  • A slowly responding DNS service.
  • Traversal of a long-distance WAN link.
  • An overloaded proxy.
  • Traversal of a DPI device.

These problems can introduce significant fractions of a second (or even whole seconds!) of latency into an application transaction. For most shops, that’s where the performance battles need to be fought. Buying the lowest-latency switch you can find just isn’t going to appreciably help, unless you’re one of the very few with that sort of application requirement. And you already know who you are. If you want a feel for the scale involved, try the quick sketch below.
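To put the units in perspective, here’s a minimal Python sketch that times two of the usual suspects from the list above – DNS resolution and TCP connection setup – against a placeholder host. The host and port are hypothetical stand-ins; substitute your own target to try it.

    # Minimal sketch: time two common latency contributors -- DNS
    # resolution and TCP connect -- for a single request.
    import socket
    import time

    host, port = "example.com", 443  # placeholder target

    t0 = time.perf_counter()
    addr = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)[0][4]
    t1 = time.perf_counter()

    sock = socket.create_connection(addr, timeout=5)
    t2 = time.perf_counter()
    sock.close()

    print(f"DNS lookup:  {(t1 - t0) * 1e3:7.2f} ms")
    print(f"TCP connect: {(t2 - t1) * 1e3:7.2f} ms")

Both figures typically come back in milliseconds – thousands of times larger than the nanosecond differences separating competing cut-through switches.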