Like any IT discipline, QoS is awash in terminology & acronyms. I’m going to tackle the most common QoS terms here, and try to provide some context in my definitions. Ideally, you’ll know not just a definition of the term, but also how the term fits into the larger QoS ecosystem.
ToS – ToS stands for “type of service.” The ToS value is stored in a byte of the IP header of a packet. The 8 bits making up the byte are broken into two fields. In modern networks, the first 6 bits represent the Differentiated Services Code Point (DSCP) value, while the last two carry Explicit Congestion Notification (ECN) values. For our purposes in this series, ECN is not a concern while DSCP is. The DSCP value is the number that distinguishes this traffic class from other classes; there are 64 possible values, 0-63. QoS policies use the DSCP value to identify a traffic class and apply a QoS behavior. Notably, the first three bits of the ToS field were once used to identify 8 classes of service (0-7) called IP Precedence. DSCP is backwards-compatible with Precedence, as certain DSCP values map to Precedence values. This mapping of DSCP-to-Precedence is important to understand in certain situations, as we’ll discuss later in the series. For a history of the IP ToS field, refer to RFC 3168, section 22.
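To make the bit layout concrete, here is a minimal Python sketch of splitting a ToS byte into its DSCP and ECN fields and deriving the legacy Precedence value. The function names are my own, chosen for illustration; they don't come from any library.

```python
def split_tos(tos: int) -> tuple[int, int]:
    """Split the 8-bit ToS byte into (DSCP, ECN)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS must fit in one byte")
    dscp = tos >> 2    # first (most significant) 6 bits
    ecn = tos & 0b11   # last 2 bits
    return dscp, ecn


def dscp_to_precedence(dscp: int) -> int:
    """IP Precedence is the first 3 bits of the ToS byte, i.e. the top 3 bits of DSCP."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp >> 3


tos = 0xB8  # binary 10111000
dscp, ecn = split_tos(tos)
print(dscp, ecn, dscp_to_precedence(dscp))  # 46 0 5
```

A ToS byte of 0xB8 carries DSCP 46 with ECN 0, and a device that only understands Precedence would read the same byte as Precedence 5.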
PHB – PHB stands for “per hop behavior.” PHBs are RFC-defined collections of generally agreed upon ToS values. RFC2597 defines the “Assured Forwarding PHB Group”, while RFC2598 defines “An Expedited Forwarding PHB”. RFC2474 defines a “Class Selector PHB” that can be used as a guideline to maintain backwards compatibility with Precedence values, while still offering more class granularity. For our purposes, PHBs define DSCP values that most vendors and network operators adhere to for particular classes of service. If you use values from the AF or EF PHBs, those will likely be recognized by other folks.
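The DSCP values behind these PHBs follow a simple pattern, sketched below in Python. The helper names are hypothetical; the numbers are the recommended codepoints from the RFCs named above (AFxy, where x is the class 1-4 and y is the drop precedence 1-3; CSn, where n is the old Precedence value; and EF).

```python
EF_DSCP = 46  # recommended Expedited Forwarding codepoint, binary 101110


def af_dscp(af_class: int, drop_precedence: int) -> int:
    """DSCP for Assured Forwarding AFxy: x = class (1-4), y = drop precedence (1-3)."""
    if af_class not in range(1, 5):
        raise ValueError("AF class must be 1-4")
    if drop_precedence not in range(1, 4):
        raise ValueError("AF drop precedence must be 1-3")
    return 8 * af_class + 2 * drop_precedence


def cs_dscp(precedence: int) -> int:
    """DSCP for Class Selector CSn: the Precedence bits followed by three zero bits."""
    if precedence not in range(0, 8):
        raise ValueError("Precedence must be 0-7")
    return 8 * precedence


print(af_dscp(1, 1), af_dscp(4, 1), af_dscp(4, 3))  # 10 34 38
print(cs_dscp(5), EF_DSCP)                          # 40 46
```

Because the class number occupies the same three bits that Precedence once used, every AF4x value reads as Precedence 4 on a device that only looks at the first three bits.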
Interface – in the context of this series, an interface is almost always the physical port on a network device that accepts or sends traffic in the form of electrical or optical signals. Specifically, we’re probably talking about an Ethernet or serial interface. Since it is possible to apply QoS policies to virtual or aggregated interfaces, we might talk about those as well, but I’ll be sure to make the distinction. There are QoS nuances that apply to virtual & aggregated interfaces as compared to physical interfaces.
Transmit ring – the transmit (Tx) ring contains a packet ready to be sent across the wire. If the Tx-ring already has something in it, then the physical interface is already sending something (i.e. is congested). If the Tx-ring is empty, the physical media is ready to send something (i.e. is not congested). Think of the Tx-ring as the last holding tank a packet goes through before being placed onto the wire; the Tx-ring sits in between the physical interface and the network device’s scheduler that determines which packet to send next.
Ingress vs. egress – these are terms of directionality referring to traffic, and are always in the context of an interface. “Ingress” traffic refers to traffic that is flowing into an interface. For example, traffic coming from a host plugged into an access layer switch is ingress traffic. The host is sending traffic that is flowing into the switch interface. “Egress” traffic refers to traffic that is flowing out of an interface. For example, traffic being sent from an enterprise to the Internet is flowing out of the router’s WAN uplink interface. Directionality is important to understand, because QoS policies are applied to interfaces in a specific direction. If you get the direction backwards, your QoS policy won’t accomplish what you intend for it to.
Queueing – if a queue is a waiting line, then QoS queueing is the process of handling backed-up packets (i.e. putting them in a waiting line, so to speak). Queueing only happens when the interface is too busy to send all the packets that have arrived at the device. If there is no interface congestion, then there is no need to queue packets; packets are sent in the order received. If packets have been queued due to congestion, a scheduler determines the order in which packets will leave the queue.
Buffer – the area in which queued packets wait. The larger the buffer, the more packets that can be held. Buffers should be sized appropriately to handle the type of application traffic they are intended to service. If a buffer is full, it is unable to accept additional queued packets, in which case the most recently arrived packets are dropped, as they have nowhere to go.
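The "nowhere to go" behavior is usually called tail drop. A minimal Python sketch follows, assuming a buffer sized in packets rather than bytes (real devices vary), with a class name I made up for illustration.

```python
from collections import deque


class TailDropBuffer:
    """Fixed-size FIFO buffer that drops new arrivals once it is full."""

    def __init__(self, capacity: int):
        self.capacity = capacity  # maximum number of queued packets
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, packet) -> bool:
        """Queue a packet; return False if it had to be dropped."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1     # buffer full: the newest arrival is lost
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        """Hand the oldest queued packet to the scheduler, or None if empty."""
        return self.queue.popleft() if self.queue else None


buf = TailDropBuffer(capacity=2)
print([buf.enqueue(p) for p in ("p1", "p2", "p3")])  # [True, True, False]
print(buf.dequeue(), buf.dropped)                    # p1 1
```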
Congestion management – a set of QoS tools that help determine what packets get sent in what order when an interface is congested (i.e. has more traffic to send than the interface bandwidth is capable of carrying). The idea here is, “Okay, we have a problem. How are we going to handle this problem?” Queueing strategies like class-based weighted fair queueing and low latency queueing fall into this category.
Congestion avoidance – QoS tools that try to prevent an interface from becoming congested by proactively dropping packets. Random early detection falls into this category, relying on TCP’s inherent behavior of slowing down transmission speed when the recipient does not acknowledge receipt of traffic. The idea here is, “We’re about to have a problem. What can we do to avoid this becoming a problem?”
Class-based Weighted Fair Queueing – weighted fair queueing allows a network device to identify flows on the wire and share bandwidth equally among the flows. With WFQ, no one flow can drown out the other ones – all flows are treated “fairly.” Class-based WFQ takes this logic one step further by allowing the network operator to identify specific traffic classes and instruct the network device, via a policy, how bandwidth should be allocated to those classes. In other words, a human writes a policy that determines fairness and how weighting should be done.
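As a rough illustration of the difference, the Python sketch below divides a congested link among flows or classes in proportion to a weight. This is a simplification I'm assuming for clarity: it ignores how a real scheduler interleaves packets and how bandwidth left unused by one class gets handed to the others. The function name, class names, and weights are all hypothetical.

```python
def allocate(link_bps: int, weights: dict[str, float]) -> dict[str, float]:
    """Divide link bandwidth among entries in proportion to their weights."""
    total = sum(weights.values())
    return {name: link_bps * w / total for name, w in weights.items()}


link = 10_000_000  # a 10 Mbps link

# WFQ-style: every flow gets the same weight, so every flow gets an equal share.
print(allocate(link, {"flow-a": 1, "flow-b": 1}))

# CBWFQ-style: the operator's policy assigns the weights per class.
print(allocate(link, {"voice": 50, "video": 30, "bulk": 20}))
```

In the first call both flows end up with 5 Mbps; in the second, the shares follow the 50/30/20 split that the policy author chose.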
Low Latency Queueing – an LLQ is a high priority queue that preempts other traffic classes without starving them for bandwidth. An LLQ accomplishes three things. First, an LLQ reserves bandwidth for a traffic class, meaning that no other traffic classes can utilize that reserved bandwidth. An LLQ also limits the amount of bandwidth that traffic class can take to the amount of reserved bandwidth, dropping traffic that exceeds the reserved rate if the interface is congested. Finally, an LLQ will dequeue packets on a regular time-slice interval to minimize jitter.
Weighted Random Early Detection – when an interface is congested, the WRED algorithm can be applied to drop traffic from certain flows. The point of WRED is to reduce interface congestion by encouraging TCP flows to slow down through drops. “Weighted” RED means that some flows can be deemed more important than other flows; by default, WRED observes the IP Precedence value to determine flow importance, but WRED can also be configured to observe DSCP values.
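The classic RED drop curve, with per-Precedence thresholds layered on top, can be sketched in Python as below. The profile numbers are invented for illustration and are not any platform's defaults. I'm also assuming the caller supplies an already-averaged queue depth; real implementations compute a moving average of queue depth themselves.

```python
def red_drop_probability(avg_depth: float, min_th: float, max_th: float, max_p: float) -> float:
    """Classic RED curve: no drops below min_th, a linear ramp up to max_p
    as depth approaches max_th, and every packet dropped at or beyond max_th."""
    if avg_depth < min_th:
        return 0.0
    if avg_depth >= max_th:
        return 1.0
    return max_p * (avg_depth - min_th) / (max_th - min_th)


# Hypothetical profiles keyed by IP Precedence: (min_th, max_th, max_p).
# The more important class starts dropping later.
PROFILES = {
    0: (20, 40, 0.10),
    5: (35, 40, 0.10),
}


def wred_drop_probability(avg_depth: float, precedence: int) -> float:
    """Pick the RED curve for this packet's Precedence, then evaluate it."""
    min_th, max_th, max_p = PROFILES[precedence]
    return red_drop_probability(avg_depth, min_th, max_th, max_p)


print(wred_drop_probability(30, 0))  # Precedence 0 is already on the ramp
print(wred_drop_probability(30, 5))  # Precedence 5 is still below its minimum
```

At an average depth of 30 packets, Precedence 0 traffic has a small chance of being dropped while Precedence 5 traffic is untouched, which is the "weighted" part of WRED.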
TCP global synchronization – this phenomenon occurs when backed-up packets from all flows arriving at a congested interface are dropped at the same time. The TCP behavior of these flows then becomes synchronized, as all of them slow down and ramp back up at the same time. This causes a sawtooth graph of bandwidth utilization, instead of smooth utilization. WRED helps mitigate global synchronization by dropping traffic from different flows at different times.
Shaping vs. policing – both shaping and policing are QoS tools that limit the rate of a class of traffic to a specific number of bits per second, but they are different in how they go about this task. Shaping has a buffer that allows for bursts of traffic with fewer drops. As traffic bursts above the shaping rate, the excess is buffered (to a point), and delivery is subsequently delayed. TCP traffic responds to the delay and ends up conforming more or less gracefully to the desired shaped rate, with a resulting smooth “plateau” graph. Shaping can be applied to outbound traffic only, as logically, buffering can only be applied to egress traffic to delay sending. (Ingress traffic can’t be delayed, because it has already arrived.) Policing offers no buffer; traffic that exceeds the policed rate is dropped. The resulting traffic pattern is not smooth, with peaks and valleys forming as TCP reacts to the dropped traffic and adjusts its sending rate. Policing can be applied to either ingress or egress traffic, as there is no buffering required. I like to think of shaping as a kind, gentle way to enforce a bit rate, while policing is cruel & heartless.
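Both tools are commonly described in terms of a token bucket, so here is a minimal Python sketch of the difference under that assumption. All names are mine, packet arrival times are supplied explicitly so the run is deterministic, and the shaper's buffer is unbounded here for simplicity; a real shaper buffers only "to a point" and drops beyond it.

```python
class TokenBucket:
    """Tokens (bytes) accumulate at the configured rate, up to the burst size."""

    def __init__(self, rate_bps: int, burst_bytes: int):
        self.rate = rate_bps / 8          # refill rate in bytes per second
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last = 0.0                   # time of the last refill

    def refill(self, now: float) -> None:
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now


def police(packets, bucket):
    """Policing: a packet that finds too few tokens is dropped on the spot."""
    sent, dropped = [], []
    for arrival, size in packets:
        bucket.refill(arrival)
        if bucket.tokens >= size:
            bucket.tokens -= size
            sent.append((arrival, size))
        else:
            dropped.append((arrival, size))
    return sent, dropped


def shape(packets, bucket):
    """Shaping: a packet that finds too few tokens waits until enough accrue.
    Assumes every packet is no larger than the burst size."""
    departures, clock = [], 0.0
    for arrival, size in packets:
        clock = max(clock, arrival)       # FIFO: never leave before the packet ahead
        bucket.refill(clock)
        if bucket.tokens < size:
            clock += (size - bucket.tokens) / bucket.rate  # delay instead of drop
            bucket.tokens = float(size)   # exactly enough tokens have now accrued
            bucket.last = clock
        bucket.tokens -= size
        departures.append((clock, size))
    return departures


burst = [(0.0, 1000), (0.0, 1000), (0.0, 1000)]  # three 1000-byte packets at t=0

sent, dropped = police(burst, TokenBucket(rate_bps=8000, burst_bytes=1500))
print(len(sent), len(dropped))                   # 1 2

print([t for t, _ in shape(burst, TokenBucket(rate_bps=8000, burst_bytes=1500))])
# [0.0, 0.5, 1.5]
```

Given the same burst, the policer forwards one packet and discards two, while the shaper delivers all three but spreads them out over a second and a half.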
Cisco NBAR – Network Based Application Recognition is a way to identify traffic flows from specific applications such as HTTP, FTP, etc. transiting a network device. From my perspective, the real value of NBAR is to aid identification of complex applications, possibly with dynamic port assignments, and then be able to mark that application’s traffic with a specific DSCP value. In a practical sense, NBAR has limited utility, as the applications it can identify are limited. In addition, many applications a network operator might not want on the network, such as file-sharing applications, can masquerade as innocent, encrypted HTTP sessions, making their identification difficult.
Cisco MQC – in Cisco-speak, the Modular QoS Command Line Interface is a way to describe a QoS policy in a common way across many Cisco platforms. The MQC policy language is used to create traffic classes with class maps. Policy maps create QoS policies that inform the network device how to handle the traffic class identified by the class map. When the policy map is applied to an interface in either an ingress or egress direction, it can begin to act on traffic transiting the interface. While there are still major differences in creating and applying QoS policies on various Cisco platforms, Cisco has committed to MQC heavily across IOS platforms over the years, phasing out some legacy ways of describing a QoS policy. Understanding MQC is key for network operators applying QoS policies to Cisco devices.