I have a lot more of this scenario to go, and my Saturday is wearing on. But I have a few comments that I have to make. First, I had a somewhat complicated BGP setup to do, with a route-reflector inside of a confederation and a lot of iBGP peers that were not on the same link. Not having the iBGP peers on the same link made for some interesting routing challenges. Why? Well, it meant that the BGP routers were doing recursive lookups to figure out where to forward to, because the advertised next-hop was often not locally connected. So…if the IGPs were not set up correctly, you could end up with a routing loop trying to get from one BGP AS to another. Which I did. And which was a pain to fix.
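For anyone who wants to see the recursive lookup spelled out, here is a minimal Python sketch of the idea. It is not how IOS implements it; the table, the prefixes, and the function names (`longest_match`, `resolve`) are all made up for illustration. The point is only that a BGP route whose next-hop is not connected has to be resolved through another route, and if the IGP route to that next-hop is missing, the lookup dead-ends.

```python
import ipaddress

def longest_match(rib, dest):
    """Return the (network, next_hop) entry with the longest prefix covering dest, or None."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, next_hop in rib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best

def resolve(rib, dest, max_depth=8):
    """Follow next hops until a connected route is reached.

    Returns the chain of (prefix, next_hop) lookups, ending with a
    "connected" entry. Raises LookupError if a next hop has no route
    or the recursion goes deeper than max_depth.
    """
    chain = []
    target = dest
    for _ in range(max_depth):
        match = longest_match(rib, target)
        if match is None:
            raise LookupError(f"no route to {target}")
        net, next_hop = match
        chain.append((str(net), next_hop))
        if next_hop == "connected":
            return chain
        target = next_hop  # recurse: look up the next hop itself
    raise LookupError(f"recursion too deep resolving {dest}")

# Hypothetical routing table: all prefixes and addresses are invented.
rib = {
    "10.1.12.0/24": "connected",   # directly attached link
    "150.1.5.5/32": "10.1.12.2",   # IGP route to the iBGP peer's loopback
    "192.0.2.0/24": "150.1.5.5",   # BGP route; next hop is NOT connected
}

for step in resolve(rib, "192.0.2.10"):
    print(step)
```

Delete the `150.1.5.5/32` entry and `resolve` raises `LookupError`, which is the toy version of a BGP route that cannot be used because the IGP does not know how to reach the next-hop.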
Now, last night, I worked on getting BGP designed and configured. That was no problem. It was today, while doing the reachability checks, that I realized I was having these problems. Part of the problem was that I’d forgotten to “redistribute connected” for a couple of loopbacks on the diagram that didn’t show up in any of the previous tasks. So…I’d forgotten about them until this morning. When I redistributed them with route-maps, I broke some other routes that were previously being redistributed for me automagically. When those other routes broke, I lost some BGP neighbors. And so the BGP world pretty much fell apart at that point.
At first, it was frustrating, and I was sorely tempted to break open the answer key and just review it all. But I overcame that temptation and hammered through it. I’m WAY over time budget at this point, but I decided it doesn’t matter this time. I have to OWN redistribution. Totally own it. And I don’t right now. So, I started doing traceroutes from the BGP routers that couldn’t get to wherever the problem destination was to hammer out the problems. And I got it all worked out. I have 100% IPv4 reachability now.
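The traceroute approach boils down to following each router's forwarding decision hop by hop and watching for the point where the path doubles back. A rough Python sketch of that idea follows. The router names, the `lo5` destination label, and the `trace` function are hypothetical, and destinations are matched by exact name rather than by prefix to keep it short.

```python
def trace(tables, start, dest, max_hops=16):
    """Walk per-router forwarding decisions and report the path and outcome.

    tables maps router name -> {destination: next router, or "local"}.
    Returns (path, outcome), where outcome is one of
    "delivered", "no route", "loop", or "hop limit".
    """
    path = [start]
    router = start
    for _ in range(max_hops):
        next_router = tables[router].get(dest)
        if next_router is None:
            return path, "no route"
        if next_router == "local":
            return path, "delivered"
        if next_router in path:
            # We are about to revisit a router: that is the loop.
            return path + [next_router], "loop"
        path.append(next_router)
        router = next_router
    return path, "hop limit"

# Hypothetical topology: R2 and R3 point at each other for lo5.
looping = {
    "R1": {"lo5": "R2"},
    "R2": {"lo5": "R3"},
    "R3": {"lo5": "R2"},
    "R5": {"lo5": "local"},
}

# Same topology after R3 is fixed to forward toward R5.
fixed = dict(looping, R3={"lo5": "R5"})

print(trace(looping, "R1", "lo5"))
print(trace(fixed, "R1", "lo5"))
```

The first trace ends at the router where the path turns back on itself, which is where to start looking at the redistribution; the second shows the same walk once that one forwarding entry is corrected.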
If you’ve been reading this blog, you know I said I had reachability a couple of days ago. So what happened? Well, before, I purposely wasn’t checking the BGP routes, and I’d forgotten about the 2 loopbacks in the diagram that didn’t show up anywhere else. So I had reachability to everything I was checking. For anything I wasn’t checking, I was blissfully ignorant. And this reveals another problem I need to overcome. Thoroughness. I have to be more thorough. In the InternetworkExpert.com lectures, it’s been mentioned that people often fail the CCIE lab because they were careless, NOT because they didn’t understand the technology.
In my redistribution work, I was careless in that I overlooked those 2 loopbacks, forcing me to go back and get them into an IGP via the only allowed method, “redistribute connected”. When I did this, I broke other things, causing an hour of troubleshooting. It was a domino effect. I forgot 2 loopbacks, forcing new redistribution, which broke reachability, which broke BGP, which led to me sitting there doing traceroutes and fixing what I’d broken. Now, if I’d paid better attention, I would have redistributed those 2 forgotten loopbacks right off the bat, overcome whatever issues arose, and been on my way with confidence. But that didn’t happen. If this had been the real lab, there’s no question in my mind that I’d have ended up failing due to lack of time, even if I knew how to do all the other tasks.
Ethan Banks writes & podcasts about IT, new media, and personal tech.