Rodney, this is stupid. We’ve got nothing to do for like the next two hours! Nothing is going to go wrong. Why can’t we update the routers now?
We’ve been over this. You’re probably right. The B-Side will probably take over just fine when we update the A-Side. But, if it doesn’t, we take the entire network down. I told the customers that we wouldn’t be doing the router upgrades until 2:00am.
But, that’s not gonna happen. We’ve done this dozens of times!
And how many times has the B-Side failed?
Maybe once, if you don’t count the times when the hardware was ancient.
It’s too risky.
And so it went. It was midnight on a “Maintenance Weekend” Friday night. We’d started with 11 tasks. Most of them were proceeding just fine. We were right on schedule to finish by 3:00am when our customers would start calling in to do verification checks.
Jake was a network engineer, one of our best. He’d done some work earlier in the evening and his next task was scheduled for 2:00am, at the end of the task list. He was upgrading the software in our core routers. These routers controlled all our traffic inside and outside our data center.
When I put the schedule together, I had intentionally put the router upgrades at the end. The chances were the update would happen without any interruption. If I had bet money, I’d have given 99:1 odds that it would work flawlessly.
But, that 1% worried me.
Jake had a point. In his view, whether we went at midnight or 2:00am, his work was exactly the same. His risks were just as likely early as late.
But, there were other tasks currently in progress that would be impacted if we brought the network down. And there was a possibility that our customers were using parts of the network even though it was the middle of my maintenance window. If that 1% longshot came through, we’d impact them as well.
I had shared my schedule, risk assessment, and impact statement for each task with my customers over the previous weeks. I had committed to holding off on the highest-impact task until the end.
None of that was convincing Jake.
There is absolutely no value in waiting on this!
I knew I was in for a long couple of hours. As an engineer, Jake saw things with a very binary view. Dealing with customer expectations was my responsibility. He just wanted to finish his work.
Finally, Brian, the portfolio director, who had watched the entire exchange with some amusement, chimed in.
We’ll do it Rodney’s way. We told the customers we’d wait until 2:00. We need to honor that commitment.
Jake looked like he was ready to plead his case to Brian, but thought better of it. Instead, he headed off to surf the internet for a couple of hours.
We finished up right on time. In fact, it was a few minutes before 2:00am when I gave Jake the go-ahead for his change.
It took 5 minutes and didn’t cause a single interruption.
Rodney M Bliss is an author, blogger and IT Consultant. He lives in Pleasant Grove, UT with his lovely wife and thirteen children.
Follow him on
Twitter (@rodneymbliss)
Facebook (www.facebook.com/rbliss)
LinkedIn (www.LinkedIn.com/in/rbliss)
or contact him at (rbliss at msn dot com)