I was tired. Very tired. Tired in my brain. Tired in my body. I needed to eat, puke, and scream…all of those things as soon as possible. Big cutovers are like that. You know the kind of change I’m talking about. The kind where you only get a maintenance window twice a year, so you plan to throw in the new core switch pair because that’s easy, re-tool the BGP peering that twelve other changes are waiting for, and bring up the new firewall all in one night.
Stupid! Unthinkable! Small changes only!! I mean…obviously. Of course. But sometimes, that’s just not the way it works out. And so it was that after several hours of executing a meticulously planned change that would create the network foundation for the company’s big plans, I needed to eat, puke, and scream.
You see, the change hadn’t gone entirely well. It had only gone mostly well. The core switch upgrade really was easy. The BGP peering work went well enough. The new firewall was a fight, though.
At first, the firewall pair wouldn’t pass traffic. At all. Despite a lovely routing table and so on. After sitting in the freezing data center for two hours in the middle of the night staring at what seemed to be a flawless configuration while waving my laptop this way and that trying to jiggle some packets loose, I finally realized I never assigned interfaces to zones. Sorta important on a zone-based firewall, as it turned out. (Can you headdesk and facepalm at the same time? Sigh.) After that, my boss came off the ceiling, the change was back on track, and we proceeded with testing all the applications our customers relied on our network to deliver.
As the sun came up, testing was going okay. Again, mostly. We had a niggling problem with SIP calls failing sometimes. Wireshark revealed weirdness in the SIP setup packets, where internal addresses were being announced to customers sitting on the Internet-facing side of our firewall. Well…huh. That’s not how that’s supposed to work. RFC 1918 ain’t reachable from the Internet, folks.
The new firewall was a Juniper SRX, and so we called the Juniper Technical Assistance Center. JTAC was using a follow-the-sun protocol, and at 6am in the eastern time zone of the US, I got routed to some poor young lady in an Indian call center.
Recall my state of mind. I was very tired, desperately wanting this change to be over so that I could eat/puke/scream. And then sleep. Oh, sleep just called to me. Longingly. Lovingly. When my mind wasn’t in overdrive trying to figure out why the firewall wasn’t doing the right thing with SIP-over-PAT, it would ponder how good it would feel to get back to my hotel room and place my head on the pillow.
Instead of sleep’s sweet caress, I was faced with JTAC’s front line triage support. The young lady was going through her script, which I couldn’t get her to deviate from to just talk like a human. I had spent much time testing and troubleshooting the issue, but she was doing what she was required to do, the rough equivalent of, “Did you turn it off and on again?” I was patient for as long as I could be, which I confess wasn’t long.
After several minutes of answering her scripted questions, I lost it. I unloaded on this kid. I was yelling over the speakerphone at her. In an open office. With my boss standing next to me. In front of my co-workers and the junior network engineer I was mentoring, all of whom had either also been there all night or come in early to help with testing. In front of everyone else, I just leveled this poor JTAC girl.
Hey, I was feeling the pressure. The clock was ticking. People were coming into the office. Customers were coming online. We needed to be back to 100% ASAP. I didn’t have time for her script. I needed a knowledgeable JTAC nerd who understood SIP and SRX firewalls and what the magical Junos CLI incantation might be to make SIP-over-PAT work.
I don’t remember what I screamed at her. I just remember the rage monster coming out. Months of design and testing leading to a high-stress night of big changes had overflowed into complete exhaustion, and Bankszilla stomped all over some kid in India. I also don’t remember how she responded after being flattened, but I think I remember that the case was escalated. Before too much longer, a clueful JTAC nerd provided me the needed Junos incantation. This was years ago and my memory fades, but I think the solution was to (counterintuitively) disable a buggy SIP fixup.
Drama over. Back to the hotel. Eat. Puke. Scream (inwardly, this time). And sleep, finally…at least for a few hours before heading back to the office to see what other issues might have cropped up.
I’ve told this story before in various settings, but there’s a reason I rehash this one from time to time.
After that incident, I learned that people won’t necessarily remember your months of meticulous planning, technical wizardry, thorough documentation, kind mentorship, or troubleshooting panache.
They’ll remember the rage monster. Is that what you want to be remembered for?