OECG – Chapter 12

Whereas the previous section of the book focuses on building the local BGP table, the next section focuses on how a BGP router advertises routes to neighbors and injects learned BGP prefixes into the local routing table.

BGP routers inform their neighbors about BGP routes with BGP Update messages. The format is thus:

  • Length in bytes of the withdrawn routes field (2 bytes)
  • Withdrawn routes (will vary depending on the number of dead routes)
  • Length in bytes of the path attributes field (2 bytes)
  • Path attributes (will vary depending on the number of PAs)
  • Prefix length (1 byte)
  • Prefix (will vary)
  • and so on…with 1-byte prefix lengths and variable field-length prefixes describing NLRIs following one on another to the end of the update message.
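
To make the layout concrete, here is a rough Python sketch that splits the body of an Update message into those fields. It is an illustration only, not a real BGP implementation: the function names are my own, it assumes IPv4 prefixes, it assumes the common BGP message header has already been stripped off, and it leaves the path attributes undecoded. One detail worth knowing: the 1-byte prefix length is a count of bits, and the prefix that follows is padded out to a whole number of bytes.

```python
import ipaddress
import struct


def parse_prefixes(data):
    """Decode a run of (1-byte length in bits, prefix bytes) entries."""
    prefixes = []
    i = 0
    while i < len(data):
        bits = data[i]
        nbytes = (bits + 7) // 8  # prefix is padded to a whole byte
        raw = data[i + 1:i + 1 + nbytes].ljust(4, b"\x00")
        prefixes.append(f"{ipaddress.IPv4Address(raw)}/{bits}")
        i += 1 + nbytes
    return prefixes


def parse_update_body(body):
    """Split an Update body into (withdrawn, raw path attributes, NLRI)."""
    (withdrawn_len,) = struct.unpack_from("!H", body, 0)
    withdrawn = parse_prefixes(body[2:2 + withdrawn_len])

    pa_offset = 2 + withdrawn_len
    (pa_len,) = struct.unpack_from("!H", body, pa_offset)
    pa_start = pa_offset + 2
    path_attrs = body[pa_start:pa_start + pa_len]  # left undecoded here

    # Everything after the path attributes is NLRI, all sharing those PAs.
    nlri = parse_prefixes(body[pa_start + pa_len:])
    return withdrawn, path_attrs, nlri
```

Feed it a hand-built body with one withdrawn route, one path attribute and two NLRIs, and you get back the withdrawn prefix, the raw attribute bytes, and both advertised prefixes as strings like "192.168.1.0/24".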

Note that the NLRIs included in a particular update message will all have the same PAs. Thus, the path attributes are critical to the Update message – they form the common bond around which the Update message is built. This reduces link overhead and CPU utilization of the router needing to process the updates. And if you haven’t hooked up an ISP router taking a full Internet BGP table lately, you’re in for about 190K routes. Any efficiency you can take advantage of is a wonderful thing.

We normally think of a router advertising every route that he knows about. But in BGP, there are routes that the BGP router will not advertise from his BGP table.

  • iBGP
    • Suboptimal routes
    • Routes denied by an outbound BGP filter
    • iBGP-learned routes (although this behavior may change in the context of confederations or route-reflectors)
  • eBGP
    • Suboptimal routes
    • Routes denied by an outbound BGP filter
    • Similar to split horizon, an eBGP router will not advertise to a neighbor any route whose AS_PATH already includes that neighbor’s ASN. The obvious assumption is that the eBGP neighbor already knows about those routes, if his ASN is already in the path. So why advertise them?
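
Here is how I picture those rules as a single yes/no check, sketched in Python. The Route class, the should_advertise function and the field names are all made up for illustration, and the sketch ignores the confederation and route-reflector exceptions mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Route:
    prefix: str
    as_path: list      # ASNs the route has passed through
    best: bool         # True if this is the selected best route
    learned_via: str   # "ebgp", "ibgp" or "local"


def should_advertise(route, neighbor_type, neighbor_asn, denied_by_filter=False):
    """Return True if the route would be advertised to this neighbor."""
    if not route.best:
        return False  # suboptimal routes are never advertised
    if denied_by_filter:
        return False  # an outbound filter blocked it
    if neighbor_type == "ibgp" and route.learned_via == "ibgp":
        return False  # iBGP-learned routes are not passed to iBGP peers
    if neighbor_type == "ebgp" and neighbor_asn in route.as_path:
        return False  # the neighbor's ASN is already in the AS_PATH
    return True
```

So a best route learned from AS 65001 would go out to an iBGP peer, but would not be sent back to an eBGP neighbor in AS 65001.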

So, how does a BGP router determine the best route? EIGRP, OSPF and RIP all compute a metric of some sort to determine best path. And it’s been mentioned that BGP looks at the shortest AS_PATH PA. But what else? Consider the following decision path in order:

  • Choose the route with the shortest AS_PATH.
  • If AS_PATH is tied, then prefer an eBGP learned route over an iBGP learned route.
  • If the routes are still tied, look at the NEXT_HOP PAs. If the NEXT_HOPs differ, choose the route with the closest next-hop, where “closest” means the one with the lowest IGP metric. One of two things must be true about the NEXT_HOP PA, else the route will be dropped from “best” consideration:
    • The NEXT_HOP must be 0.0.0.0 (because the route was injected into the BGP table from the local router).
    • The NEXT_HOP must be reachable. NEXT_HOP must match up with a route that’s in the routing table. (And this is an absolute mind-blower the first time you run into it with something like 2 iBGP routers talking to each other and to 2 different ISPs. You’ll see the routes flow in from the ISPs fine. All the eBGP routes converge fine, pointing out to the ISP. Then the iBGP session gets rolling, and you have this rolling blackout of routes. Some routes hit the routing table that point to the other iBGP router. Then they’re gone. And you’ll bang your head right until you add a static route for the IP address of the other ISP’s next-hop in your iBGP routers. Then everything settles down, and all is well.)
  • If there’s still a tie, choose the iBGP advertised route with the lowest BGP RID.
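
The list above reads naturally as a sort key, so here is a small Python sketch of it. It covers only the steps listed in this post and nothing more; the Path class, its fields and the helper names are my own inventions. A path whose NEXT_HOP is neither 0.0.0.0 nor reachable is thrown out before any comparison happens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Path:
    as_path: list
    is_ebgp: bool
    next_hop: str
    igp_metric: Optional[int]  # None means no route to the NEXT_HOP
    router_id: str


def rid_key(rid):
    """Turn a dotted RID into a tuple so it compares numerically."""
    return tuple(int(octet) for octet in rid.split("."))


def usable(path):
    """NEXT_HOP must be 0.0.0.0 (locally injected) or reachable."""
    return path.next_hop == "0.0.0.0" or path.igp_metric is not None


def best_path(paths):
    """Pick the best path using only the tie-breakers listed above."""
    candidates = [p for p in paths if usable(p)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda p: (
            len(p.as_path),          # shortest AS_PATH
            0 if p.is_ebgp else 1,   # eBGP before iBGP
            p.igp_metric or 0,       # closest NEXT_HOP
            rid_key(p.router_id),    # lowest RID
        ),
    )
```

Notice that a path with the shortest AS_PATH still loses if its NEXT_HOP is unreachable, which is exactly the rolling-blackout behavior described above.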

There are other things to keep in mind regarding the NEXT_HOP PA.

  • If you’re advertising to an iBGP neighbor, the NEXT_HOP PA is preserved. So if you learn a route via eBGP from your ISP, the NEXT_HOP will be the ISP’s router, right? Well, when you advertise that route to your iBGP neighbor in your redundant Internet edge config, the iBGP router will also get the NEXT_HOP of that ISP’s router…which is fine as long as you have a static route or IGP running so that the other iBGP router knows how to route to that next hop. If you wanted to change that behavior, you could do so with the “neighbor <xx.xx.xx.xx> next-hop-self” command, which in my example would replace the ISP’s NEXT_HOP PA with the address of the iBGP router you’re advertising from.
  • Now, if you’re advertising to an eBGP neighbor, the NEXT_HOP PA is changed to the “update-source” address by default, fairly intuitive behavior. But if you don’t want this behavior, you can change it with the “neighbor <xx.xx.xx.xx> next-hop-unchanged” command.
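
Those two defaults and their two overrides boil down to a few lines of logic. The Python below is a toy model of them for a learned route, not anything a router actually runs; the function name, its parameters and the addresses in the example are placeholders of mine.

```python
def advertised_next_hop(learned_next_hop, neighbor_type, local_address,
                        next_hop_self=False, next_hop_unchanged=False):
    """Return the NEXT_HOP a learned route carries when advertised."""
    if neighbor_type == "ibgp":
        # Default: preserve the NEXT_HOP. next-hop-self swaps in our address.
        return local_address if next_hop_self else learned_next_hop
    # eBGP default: rewrite to our own address.
    # next-hop-unchanged keeps the learned value instead.
    return learned_next_hop if next_hop_unchanged else local_address
```

With an ISP next hop of 192.0.2.1 and a local address of 10.0.0.1, the iBGP neighbor sees 192.0.2.1 by default and 10.0.0.1 once next-hop-self is turned on.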

Some notable commands you’ll like to use when looking at the BGP table:

  • “show ip bgp neighbors <xx.xx.xx.xx> advertised-routes” will display the routes you’re advertising to that neighbor.
  • “show ip bgp neighbors <xx.xx.xx.xx> received-routes” will display the routes you’ve received from that neighbor, assuming you have also configured the “neighbor <xx.xx.xx.xx> soft-reconfiguration inbound” command for that neighbor.
  • When you look at prefixes listed in the BGP table, the “>” next to a prefix marks the selected “best” route.