OECG – Chapter 3 “Spanning Tree Protocol” Part 1 of 2

This probably ends up being a 2-parter. It’s 6pm already, my dinner is coming soon, and I’m a little fried from the day at work. We’ll see how far I get. Here goes.

  • Spanning-tree protocol (STP) is the language switches speak to be sure that there are no topology loops in the layer 2 network. This allows for redundancy, in that you can have 2 parallel links between switches. Spanning-tree will know that there are 2 links, forward across one of them, and block across the other. If the forwarding link goes down, spanning-tree will move the other link from blocking to forwarding, and the network is back up. Of course, the beast that is spanning-tree has lots and lots of details that make all of these things happen.
  • STP has 3 major steps used to determine which ports will forward and which will block.
    (1) Elect the root switch – the switch with the lowest bridge ID wins. Everybody thinks they’re root at the beginning. But when they hear a superior hello (from a lower bridge ID), they forward the superior hello on.
    (2) Determine each switch’s root port – this is the port on the switch with the lowest cost back to the root switch.
    (3) Determine the designated port for each segment – when multiple switches connect to the same segment, this is the port on the switch that forwards the lowest-cost hello onto the segment.
  • The bridge ID has a specific format. Back in the day, it was a 2-byte priority field (0 – 65535) followed by the 6-byte MAC. The MAC acts as a tie-breaker, in case the priorities match. Later on the 2-byte priority was broken into 4-bit and 12-bit fields. The 4-bit field is a multiplier for 4096. The 12-bit field is the system ID extension, and generally holds the VLAN ID. This change was to accommodate such things as Per-VLAN Spanning Tree Plus (PVST+) and Multiple Spanning Trees (MST). This 4-bit/12-bit layout means that a switch can use the same MAC for every VLAN, and not have to use a separate BIA for every VLAN it’s running a spanning-tree instance for (called MAC address reduction).
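To make the 4-bit/12-bit layout concrete, here is a minimal Python sketch that packs a bridge ID and picks a root by comparing the packed values. The function name `bridge_id` and the sample MACs are made up for illustration; only the field layout comes from the notes above.

```python
def bridge_id(priority: int, vlan_id: int, mac: str) -> bytes:
    """Pack an 8-byte bridge ID: 4-bit priority multiplier, 12-bit system ID extension, 6-byte MAC."""
    if priority % 4096 != 0 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 between 0 and 61440")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("system ID extension must fit in 12 bits")
    mac_bytes = bytes.fromhex(mac.replace(".", "").replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC must be 6 bytes")
    # The priority is already (multiplier * 4096), so it occupies the top 4 bits as-is;
    # the VLAN ID fills the low 12 bits.
    first_two = priority | vlan_id
    return first_two.to_bytes(2, "big") + mac_bytes


# Hypothetical switches in VLAN 10.
a = bridge_id(32768, 10, "0011.2233.4455")
b = bridge_id(28672, 10, "00aa.bbcc.ddee")

# Lowest bridge ID wins the root election; equal-length bytes compare numerically.
root = min(a, b)
print(a.hex(), b.hex(), "root:", root.hex())
```

Switch b wins here despite its higher MAC, because the priority field is compared first and the MAC only breaks ties.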
  • Once the root bridge is decided, switches need to know the fastest way back to that root bridge. That’s called determining the root port. So, how’s that work? The root bridge sends a Hello every “Hello timer” interval, 2 seconds by default. A switch receives the hello and forwards it on, after updating the cost, the forwarder’s bridge ID, the forwarder’s port priority and the forwarder’s port number. Hellos are not sent out of ports that are in a blocking state. The port with the lowest computed cost back to the root bridge is considered the root port.
  • Port costs: you can configure them to whatever you want if you’re trying to force spanning-tree to converge in a particular way (kind of like messing with an interface’s bandwidth statement can artificially manipulate routing convergence with certain dynamic routing protocols). Default port costs (old/revised) are as follows: 10Mbps = 100/100, 100Mbps = 10/19, 1Gbps = 1/4, 10Gbps = 1/2. The revised values are the “real-world” ones you’re likely to encounter, as the revision dates from the late 1990s.
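As a quick worked example of how those default costs add up along a path to the root, here is a small Python sketch. The table values are the ones listed above; the function name and the sample paths are hypothetical.

```python
# Default port costs keyed by link speed in Mbps, as listed in the notes above.
ORIGINAL_COST = {10: 100, 100: 10, 1000: 1, 10000: 1}
REVISED_COST = {10: 100, 100: 19, 1000: 4, 10000: 2}


def root_path_cost(link_speeds_mbps, table=REVISED_COST):
    """Sum the port cost of every link crossed on the way back to the root."""
    return sum(table[speed] for speed in link_speeds_mbps)


# Two hypothetical paths from the same switch back to the root.
via_fast_ethernet = root_path_cost([1000, 100])   # 4 + 19
via_two_gig_links = root_path_cost([1000, 1000])  # 4 + 4
print(via_fast_ethernet, via_two_gig_links)
```

With the revised costs the two-gigabit path (8) beats the path that crosses a 100Mbps link (23), so the port facing the gigabit path would become the root port.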
  • How does a switch trying to determine root port sort out an equal cost tie?
    Lowest value of the forwarding switch’s bridge ID. (Which will sort things out most of the time, assuming there’s 2 different switches that could be traversed at an equal cost to get to root.)
    Lowest port priority of neighboring switch. (Remember, this is included in the Hello from the neighboring switch.)
    Lowest internal port number of neighboring switch.
    The last 2 are targeted at helping sort out parallel links between 2 switches.
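The tie-break order above maps neatly onto a tuple comparison. This is only a sketch: the `ReceivedHello` class, its field names, and the interface names are invented for illustration, and the cost is assumed to already include the receiving port’s cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReceivedHello:
    local_port: str            # port this Hello arrived on
    root_path_cost: int        # advertised cost plus the receiving port's cost
    sender_bridge_id: int      # forwarding switch's bridge ID
    sender_port_priority: int  # neighbor's port priority
    sender_port_number: int    # neighbor's internal port number


def pick_root_port(hellos):
    """Lowest cost wins; ties fall through bridge ID, port priority, port number."""
    best = min(
        hellos,
        key=lambda h: (
            h.root_path_cost,
            h.sender_bridge_id,
            h.sender_port_priority,
            h.sender_port_number,
        ),
    )
    return best.local_port


# Two parallel links to the same neighbor: only the neighbor's port number differs.
neighbor = 0x800A001122334455
hellos = [
    ReceivedHello("Gi0/1", 8, neighbor, 128, 2),
    ReceivedHello("Gi0/2", 8, neighbor, 128, 1),
]
print(pick_root_port(hellos))
```

In this example the local port Gi0/2 wins, because it hears the Hello sent from the neighbor’s lower-numbered port.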
  • The designated port then, is the port that forwards frames onto a segment. To win the right to be the designated port, the switch must send the Hello with the lowest advertised cost. The designated port is also used to send Hellos onto a segment. In case of a tie, the same rules apply as apply to root port.
  • So, now the big question: what sort of events cause STP to converge? To understand this, you need to understand what’s going on while the STP environment is stable.
    The root switch generates a Hello every Hello timer interval (2 seconds by default). You’ve probably seen these frames in a sniffer trace.
    Each non-root switch gets a copy of the one root’s Hello via its root port.
    Each switch updates the Hello with its own information and sends it out the designated ports.
    For each blocking port, the switch gets a Hello from the designated port (the port on the other end of the wire). He gets the Hellos, but he doesn’t forward them.
    So…that being the normal state of things, STP knows it needs to converge when something interrupts the happy flow of regularly spaced hellos from the root bridge (maxage timer expires, 10xHello=20 seconds by default). Or when hellos are showing up in unexpected places. Or when a hello claims to be superior to the root bridge you already know and love. And in that case, the whole root bridge election process starts over. On switches where the topology has changed (a port went down), that probably means a service interruption while convergence takes place. On switches where nothing’s changed, forwarding may well continue as normal. It’s not as if all switches participating in STP will decide to move everyone to listen/learn just because some newbie 3 switches away decides to tell everyone he’s root.
  • TCN = topology change notification. When things change on an STP switch, he’ll send a TCN BPDU out his root port toward the root bridge every Hello interval until he sees a BPDU with the TCA (topology change acknowledgement) bit set. The root bridge doesn’t get the TCN directly – if there are several switches between the TCN sender and the root bridge, each switch in turn forwards the TCN BPDU out its root port until it is finally delivered to the root bridge. Then the root bridge sends out his next several Hellos with the TC (topology change) flag set. WHY do we care about TCNs? Because when a switch receives a Hello with the TC flag set, it will use the forward delay timer to age out entries in its CAM, i.e. the MAC address table, or bridging table.
  • When ports need to transition from blocking to something else (they are newly elected root ports or designated port let’s say), they go through a process before they are forwarding. They will first listen, then learn, with each process taking the “forward delay” amount of time, 15 seconds by default. This is part of the reason for long delays in STP convergence.
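Putting the default timers from these notes together gives a rough feel for how long a blocked port takes to start forwarding. The sketch below is a simplification under stated assumptions: it ignores Hello propagation time, and it assumes a switch that sees the failure directly skips the maxage wait. The function name is hypothetical.

```python
MAX_AGE = 20        # seconds, default maxage timer
FORWARD_DELAY = 15  # seconds, default; spent once in listening and once in learning


def time_to_forwarding(direct_failure: bool) -> int:
    """Rough seconds before a previously blocking port is forwarding."""
    # Assumption: a switch that sees the link drop locally does not wait for
    # maxage; one that merely stops hearing Hellos waits for maxage to expire.
    wait = 0 if direct_failure else MAX_AGE
    return wait + 2 * FORWARD_DELAY  # listening + learning


print(time_to_forwarding(direct_failure=True))
print(time_to_forwarding(direct_failure=False))
```

Under those assumptions the listen and learn stages alone cost 30 seconds, and waiting out maxage first pushes the total to 50.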
  • PVST+ – creates a unique instance of spanning-tree for each VLAN on a switch. This is pretty spiffy, because if you set different root bridges, you can have 2 trunks lit up, with some VLANs heading down one trunk, and some VLANs heading down a different trunk, just depending on which trunk link is the root port for any given VLAN. This is sort of a cheap load-balancing scheme. Cisco supports PVST+ over 802.1q, although dot1q in and of itself does not. Cisco magic makes that work. So, if you’re running 802.1q and talking to a non-Cisco switch, you need CST (common spanning-tree). This means that there’s only one STP instance running, and it’s running on the native VLAN across the trunk.
  • It’s possible to tunnel across a CST so that 2 PVST+ regions separated by the CST can still talk. This is done by sending the BPDUs into the CST with a multicast address of 0100.0CCC.CCCD, and tagging the BPDU with the proper VLAN ID. The non-Cisco switch sees this as a multicast (not a BPDU), and treats it as a multicast. The PVST+ switch sitting on the other side of the CST will catch the multicast BPDU, and deal with it appropriately.
  • Commands we care about: “show spanning-tree root”, “show spanning-tree vlan 1 root detail”, “spanning-tree vlan 1 priority 28672”, the interface-level “spanning-tree vlan 1 cost 100”, and “spanning-tree vlan 1 root primary”.
  • “spanning-tree vlan 1 root primary” would set the local switch to be the root bridge with a priority of 24,576. If there’s already a bridge out there with 24,576 or less, then the local switch bridge priority is set to 4096 less than the lowest out there. This is a one-time deal. This switch won’t automatically change its bridge priority if some other switch happens to come on the wire with a lower priority. You can run the command again, certainly, but it’s a manual process.
  • “spanning-tree vlan 1 root secondary” would set the local switch up as the backup root bridge with a priority of 28,672. No word on what happens if someone’s already out there with 28,672. (Seems sort of silly to me, unless you already know what the bridge priorities are out there…and if you know that, wouldn’t you just set your bridge priority to what you want it to be, and not leave anything to happenstance?)
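The “root primary” rule described above boils down to a little arithmetic. This Python sketch models only the rule as stated in these notes, not actual IOS code; the function name is made up, and raising an error when no usable lower priority exists is an assumption on my part.

```python
def root_primary_priority(current_lowest: int) -> int:
    """Priority the 'root primary' rule above would pick, given the lowest priority already seen."""
    if current_lowest > 24576:
        return 24576
    candidate = current_lowest - 4096
    if candidate < 1:
        # Assumption: treat "nothing usable below the current root" as a failure.
        raise ValueError("no lower priority available")
    return candidate


print(root_primary_priority(32768))  # existing root at the default priority
print(root_primary_priority(24576))  # someone already sits at 24,576
```

Remember that this is a one-time calculation, which is exactly why the command has to be re-run by hand if a lower-priority switch shows up later.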

Okay, my brain is done for tonight. I’ll polish this chapter off tomorrow night with a discussion on Optimizing Spanning Tree. (Cause we all know it can suck when left to itself.)