From a small number of unreachable hosts to large Internet outages, complexity sneaks up and bites network operators, sometimes at the most unexpected and undesirable times. While many mistakes in networks are manual configuration errors, the events of August 30th, 2020 served as a reminder that automated processes can sometimes propagate a mistake so fast, and so pervasively, that even a misbehaving part of the Internet cannot be fully withdrawn and keeps sucking in packets that will never be delivered. There are always tradeoffs, some complexity is inevitable, and large networks may always be operating on the edge of chaos. Complexity is worthy of our attention.
While the book “Navigating Network Complexity” is 4-5 years old, it did seem apropos that I was reading it this last weekend. The subject is compelling and complex. The book itself covers a large scope of complexity issues in IP routing distributed control planes, in addition to some treatment of newer areas of complexity: programmable networks, centralized controllers and NFV. No doubt we as an industry have more to learn about some of the latter.
Net-net: A good and important read. Below is my summary / book review.
Style and Approach
- Written in 2015, published in 2016, ISBN-13: 978-0-13-398935-9
- Authors: Russ White, Jeff Tantsura; both have extensive experience
- Many examples, with a consistent framework (state, speed, surface)
- Complexity theory from outside networking is referenced
- Information-dense book, so some sections can be tough sledding, though most are OK
- An extensive and detailed baseline of the subject of complexity in networking
- Does not wave a magic wand over the issue to provide a quick fix, but instead encourages practitioners to understand complexity
- Recognizes the important role that tools, and modeling, can/could play for practitioners.
- A few more summaries and quantification of state examples would have been nice, and certainly something network practitioners should have access to in their design work/tools.
So many great insights and comments, so hard to pick out just a few, but:
- Complexity is complex. Simplifying the discussion/topic is challenging
- Complexity arises from the myriad intersecting dimensions of network engineering, which also interact with overall business goals/network optimizations
- The more our expectations of IP routing increase, the more complexity we create trying to meet those expectations. A particularly interesting point at a time when China/Huawei are pushing the ITU to develop a “New IP”
- You may not find the discussion of EIGRP relevant to you. I thought it was at least interesting and illustrative, even if I will not myself be working with EIGRP
- BGP, OSPF, and IS-IS were well-enough covered. Some working knowledge of each is probably useful in reading this book.
- Bottom-line: Excellent discussion of complexity in IP routing’s distributed control plane(s).
A discussion I hear frequently is whether routing is a solved problem. This book did nothing to convince me that it is. However, set against everything we know about the complexity of a dynamically routed network with a distributed control plane is the everyday experience we all have of many different types of Internet services just working, most of the time. The everyday experience is that IP routing works. Of course, as our expectations rise, so too does our tendency to add even more complexity and to search for new answers, whether it be the ITU’s “New IP” or centralized control planes.
The great book “Wealth of Nations” is sometimes controversial, and often accused of not being novel, simply collating many already discovered economic truths and theories. However, even its critics would argue that, after Wealth of Nations, you could not discuss economics without referencing the book. Such is the power of a book that puts a stake in the ground and says: this is where we are. I don’t mean to imply there is nothing novel in “Navigating Network Complexity”, but I do mean to imply that people who want to wade into the murky marshes of network complexity should probably give this book a read. There is a great deal of grounding material here. Reading the general literature on complexity, developed outside of networking, would also be informative, IMO.
The framework used throughout this book is state, speed, and surface (interactions / couplings). All three are intuitively dimensions of complexity, with state being perhaps the most regularly discussed in general industry discussions. The book does frequently discuss the role of CPU, memory, per-hop behavior, and link capacity in complexity, but does not include capacity in the framework. I have a tendency to include capacity in my writing and in my mental model, and I am inclined to include it in any model I have of complexity. This is a preference on my part, and no disagreement with this book’s content should be implied. Likewise, I may tend to refer to outcomes rather than optimizations. Referring to optimizations obviously has resonance with mathematical functions and may even be the right lens for an engineering discussion, but because I also think of non-optimizations as part of the toolkit, traded off against more capacity, I just have a preference for the language of outcomes. Once again, no disagreement with the content in this book should be implied. This is just language and preferences. I am in definite agreement that state, speed, and surface are important aspects of complexity.
Of the three, surface would seem to be the hardest to understand and quantify. State can be quantified and compared, and I actually would have liked it if the book did quantify it when making comparisons. Speed can also be quantified, though its consequences may not be predictable. How deep or how wide is a surface (interactions between components / modules)? That seems a little more challenging to quantify consistently across all who may read this book (an observation, not a criticism; it is what it is). Surface is not the only challenge. For example, while every software engineer knows what a race condition is, that doesn’t mean every software engineer is going to recognize the conditions that create one, nor, for that matter, will every network engineer making design decisions that do not even involve touching the code. There remain areas of this topic that pose significant challenges in understanding, quantifying, and predicting, especially on a white board.
In the 1992 classic “Complexity: The Emerging Science at the Edge of Order and Chaos”, two important points were made. Firstly, mainstream sciences tend to diminish the findings of simulation, preferring mathematical equations, empirical evidence, and other well-trodden aspects of the scientific method. A recent LinkedIn thread highlighted the important role simulation played at Amazon in influencing network architecture in some parts of the network. In the domain of autonomy, including car autonomy, a feeling has emerged that not all possible conditions can be foreseen by engineers, so real-world learning should be augmented with extensive simulation, because there is something very important at stake: human life. The book does make mention of the tools gap in networking (tools available and complexity knowledge of people using tools). Given the tools gap, and the approaching era of AI/ML in networking, simulation may, IMO, become a big part of the network engineer’s toolset (subject to availability). The second important point “Complexity:…” makes is that innovative creativity lives at the edge of chaos. I suspect the authors of “Navigating Network Complexity” may agree, and if not, at least a certain amount of complexity is inescapable and necessary in networking.
Lastly, I have a tendency to view network analysis through the lens of information dynamics, which is perhaps not a complete view, but just my proclivity. I found some of my own lens implicit in the book’s many references to certainty/uncertainty, information hiding, etc. These views of course lean on Information Theory, a subject I recommend for network analysis, at least the basic concepts.
There are many questions I would love to ask the authors, and there are many aspects of the book worthy of discussion, but that would make the review long. In summary, I would simply say, if you are interested in the subject of IP routing complexity, then read this book.
[Note: I have no affiliation with the authors or book publisher]
Some quotes from the book that stood out for me. This list is neither exhaustive nor a sufficient distillation of a book that is information-dense, just a few things that resonated with me and perhaps are worthy of noting / discussing further.
- The connection between complexity and understanding is tenuous at best.
- What really matters WRT the number of moving parts, is the interaction between those parts
- Anything that adds unnecessary ____ is perceived as complex
- By tunneling…the IP control plane’s complexity is reduced
- Complexity…arises…from design strategies intended to create robustness to uncertainty [Alderson and Doyle]
- Robust yet fragile [reminded me of the Oak & the Willow]
- Network complexity, then, cannot simply be measured, computed, and “solved”, in the traditional sense…because there is no way to measure or express intent.
- In terms of complexity, ring topologies add a lot less load to the network’s (routed or IP) control plane [always two and only two neighbors]
- The ability to create virtual topologies allows us to transfer the complexity of configuring a lot of filters into the complexity of an additional set of control plane and data primitives
- EIGRP is a distance vector protocol, no topology information is carried, only reachability information…So why does EIGRP scale so well in relatively design free networks? Because the state carried in the control plane is minimal compared to most other protocols, and because the Diffusing Update Algorithm (DUAL) does such a good job at spreading the convergence load through the network.
- …the protocol complexity didn’t appear to be as difficult to manage as redesigning every EIGRP network in the world wholesale.
- …every network that succeeds does so in much the same way as every other network in the world, but every network that fails does so in a completely unique way.
- Complex systems, particularly highly redundant and available ones, are always in a pseudo-failure mode. There is always something wrong someplace in any truly complex system. [Reminded me of the edge of chaos being where creativity and innovation live.]
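
As a small illustration of the ring topology quote above, and of my wish for more quantified state comparisons, here is a minimal sketch that counts neighbors per node and total links for a ring versus a full mesh. Treating adjacency counts as a crude proxy for control plane load is my own simplifying assumption, not a metric taken from the book, and the function names are hypothetical.

```python
from typing import Tuple


def ring_adjacencies(n: int) -> Tuple[int, int]:
    """Return (neighbors per node, total links) for a ring of n routers."""
    if n < 3:
        raise ValueError("a ring needs at least 3 nodes")
    # Every node in a ring has exactly two neighbors, and there are n links.
    return 2, n


def full_mesh_adjacencies(n: int) -> Tuple[int, int]:
    """Return (neighbors per node, total links) for a full mesh of n routers."""
    if n < 2:
        raise ValueError("a mesh needs at least 2 nodes")
    # Every node peers with every other node: n - 1 neighbors, n(n-1)/2 links.
    return n - 1, n * (n - 1) // 2


if __name__ == "__main__":
    # Adjacency count is only a rough stand-in for control plane state (assumption).
    print(f"{'nodes':>5} {'ring nbrs':>9} {'ring links':>10} {'mesh nbrs':>9} {'mesh links':>10}")
    for n in (4, 8, 16, 32):
        ring_nbrs, ring_links = ring_adjacencies(n)
        mesh_nbrs, mesh_links = full_mesh_adjacencies(n)
        print(f"{n:>5} {ring_nbrs:>9} {ring_links:>10} {mesh_nbrs:>9} {mesh_links:>10}")
```

The per-node neighbor count stays at two in the ring no matter how large it grows, while in the full mesh it grows with every node added. This says nothing about speed or surface, which is part of why a single number rarely captures complexity.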