Recently I had the opportunity to make a series of videos talking about some of the challenges Network Engineers have in their day-to-day work. They are located here: Part 1, Part 2, and Part 3. Specifically, I talk about how we are, in many situations, our own worst enemy. So what kind of behaviors and thinking do we engage in that have a significant negative impact on us? And is there a vision for getting us out of this mess? Let’s find out, but first…
Leonard Kleinrock published the first paper on packet switching theory in 1961. Less than a year later an underground coal fire started in the town of Centralia, Pennsylvania. That fire is still burning to this very day. As a self-styled network mystic, I don’t think that these two things are a coincidence. Read on, and I think you’ll agree.
Part 1: It’s all in our Head
Traditionally, networks have been implemented and operated through the “management plane.” This, very specifically, refers to the set of interfaces on network devices that allow us to configure and troubleshoot them. This device-centric view of how we interact with the network runs counter to our need for understanding how the network behaves on an end-to-end basis. When we produce the various supporting documents for the network, we almost never capture this high-level view of the end-to-end services adequately.
Perl, it is said, is a “write-only” language. People who write Perl code in support of their daily activities often have no idea what they were trying to do when they look at that code at a later time. We’re talking about their own code here! Well, networks themselves are often very similar. The supporting documents are never created, or they are incomplete, out of date, or missing. Coming back to a part of your network that you designed and implemented can sometimes be a lot like reading Perl code a year later. I have often wondered, “Why did I do this? What was I trying to achieve here?” when troubleshooting. Over time, changes creep in and complicate the situation even further.
Part 2: Exceptions are the Rule
This situation gets more twisted when we finally stop and acknowledge that over time lots of variation starts to appear in our network design and configurations. All kinds of constraints cause us to deviate from our “network standards.” As we build out the sections of our network, we find ourselves making exceptions over and over again.
We often say, as network engineers, that if only we kept things consistent and simple then all this would be a lot easier. After 20 years in networking, I have learned that this is naive thinking. Whenever something unpredictable happens in the network, we have a knee-jerk reaction to it. We put our shocked face on and say, “Whoever did this made it so complicated! If only we could adhere to our standards!”
New application requirements, new technologies, and a seemingly endless array of unforeseen engineering challenges and issues are what define the practice of Network Engineering. The network operations team may win some battles, but history has shown us, since the very first packet network, that we will always lose the war. We are almost never able to say “no” to the businesses we serve.
Part 3: Combined Effects
When you seriously consider that much of networking is all in our heads and that exceptions are the rule, and combine this with our understanding of the network engineering lifecycle, a pattern starts to emerge. Most of our frustrations stem from the fact that the loop from Design, to Implementation, to Operations is broken. Perhaps in multiple places.
This issue has existed since the first days of ARPANET in 1969. “Coincidentally” this was the first year that residents of Centralia started reporting headaches and nausea from the coal fire gases. Headaches and nausea you say? I’ve experienced these effects during major network outages. Another strange coincidence you should note: The formal eviction notice from the state of Pennsylvania to the last remaining Centralia residents was issued in 2009. The term “SDN” was coined in 2009. Centralia is psychically linked to our collective network engineering consciousness. If you deny this, you obviously hate science. Obviously.
Network operations have been a dumpster fire ever since the first days of packet networks, just like Centralia, Pennsylvania. And just like the residents of that poor town, it’s time for us to acknowledge reality and move on. The Apstra Operating System (AOS) takes on this enormous challenge for network engineers directly. It does this in two important ways:
First, AOS carefully models the services that we need the network to deliver, and allows the network engineer to express their intent through this model, rather than through the CLI with protocols and encapsulations and all the associated knobs. We no longer have to recreate our headspace by examining cryptic configurations and the document relics we scrounge together months or years after the fact because AOS captures this intent for us.
Second, AOS uses a carefully engineered closed-loop feedback system which, on the one hand, drives the network towards a state consistent with the intent of the engineer. On the other hand, AOS is constantly validating that the state of the network is, in fact, consistent with the intent of the engineer. This loop runs continuously on a near real-time basis.
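To make those two ideas a little more concrete, here is a toy sketch in Python of what "capture the intent, then keep checking the network against it" can look like. To be clear, this is not AOS code, its API, or its data model. The device names, the VLAN-per-device shape of the intent, and the functions `find_drift` and `reconcile` are all made up for illustration, and a real system would talk to real devices rather than to a dictionary.

```python
# Toy illustration only: NOT the AOS API or data model.
# Intent and observed state are both hypothetical {device: set_of_vlans} maps.

def find_drift(intent, observed):
    """Validation half of the loop: report where observed state deviates from intent.

    Returns {device: (missing_vlans, unexpected_vlans)} for deviating devices only.
    """
    drift = {}
    for device, wanted in intent.items():
        actual = observed.get(device, set())
        missing = wanted - actual        # intended but not present
        unexpected = actual - wanted     # present but never intended
        if missing or unexpected:
            drift[device] = (missing, unexpected)
    return drift


def reconcile(intent, observed):
    """Driving half of the loop: return a copy of the state pushed toward intent."""
    corrected = {device: set(vlans) for device, vlans in observed.items()}
    for device, (missing, unexpected) in find_drift(intent, observed).items():
        state = corrected.setdefault(device, set())
        state |= missing
        state -= unexpected
    return corrected


# The engineer states what the network should deliver...
intent = {"leaf1": {10, 20}, "leaf2": {10, 20}}
# ...and this is what the (pretend) network reports back.
observed = {"leaf1": {10, 20}, "leaf2": {10, 30}}

drift = find_drift(intent, observed)
after = reconcile(intent, observed)
remaining = find_drift(intent, after)
```

In this made-up example, `drift` flags only `leaf2` (VLAN 20 is missing and VLAN 30 was never intended), and after one pass of `reconcile` there is nothing left to flag. A continuously running loop would simply repeat the check and the correction over and over.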
The Self-Operating Network
Together, the intent-based model and the closed-loop system finally address the multi-decadal dumpster fire that is our passion. Networking no longer has to be “all in our heads,” and we finally have an effective and realistic way to deal positively with the inevitable variations that occur in our networks. Now we can build better networks faster, and provide the level of support we have always dreamed of.
AOS is effectively an always-on addition to your network team. We say “addition” because this isn’t about automating the jobs of network engineers away. This is about helping the whole network team, from designers to operators, get out of their own way and provide the value that businesses need from them. AOS allows us, the network engineers, to stop spending the majority of our time being firefighters and spreadsheet technicians and to spend more time delivering the tailored and automated network services that our businesses need.
And don’t forget to watch the videos!