We Automate to Grow: The Past and Future of IP Networks

Imagine a world where IP networks did not have dynamic routing protocols. What would be the alternative? Configuring routes via a management system? Would the network with the best management system have a competitive advantage over other networks? How would that network respond to frequent topology changes? Would only a few networks operate well, because of the management tax required to deliver the desired operational outcomes?

We don’t have to answer those questions, because IP networks do have dynamic routing protocols, which provide “an automated way to maintain, over time, a common view among the routers in a network of what paths/routes are available and/or of the network’s topology. From this information, every router can determine the next router a packet should be forwarded to, even after topology changes occur. As a result of this automation, IP networks can scale to large networks, with many routers and many alternate paths/routes between sources and destinations.” It is one thing to build a network with a single primary route and a single backup route. It is another to build a network that optimizes over many, many route options.
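The essence of that automation can be sketched in a few lines. What follows is a toy illustration of the link-state idea, not any real protocol: each router runs a shortest-path computation (Dijkstra) over its view of the topology to derive a next hop for every destination, and simply reruns it when a link fails. The topology and router names are invented for the example.

```python
import heapq

def shortest_path_next_hops(graph, source):
    """Dijkstra's algorithm: from one router's view of the topology,
    compute the next hop toward every other router."""
    dist = {source: 0}
    next_hop = {}
    pq = [(0, source, None)]  # (cost, node, first hop on the path)
    while pq:
        cost, node, hop = heapq.heappop(pq)
        if cost > dist.get(node, float("inf")):
            continue  # stale entry
        if hop is not None:
            next_hop[node] = hop
        for neigh, weight in graph[node].items():
            ncost = cost + weight
            if ncost < dist.get(neigh, float("inf")):
                dist[neigh] = ncost
                # The first hop is the neighbor itself if we are at the
                # source; otherwise it is inherited from the path so far.
                heapq.heappush(
                    pq, (ncost, neigh, neigh if node == source else hop))
    return next_hop

# A small topology: two paths from A to D, with A-B-D the cheaper one.
topo = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "D": 1},
    "C": {"A": 5, "D": 1},
    "D": {"B": 1, "C": 1},
}
print(shortest_path_next_hops(topo, "A"))  # {'B': 'B', 'D': 'B', 'C': 'B'}

# Link A-B fails; the computation reruns and traffic shifts via C.
del topo["A"]["B"]; del topo["B"]["A"]
print(shortest_path_next_hops(topo, "A"))  # {'C': 'C', 'D': 'C', 'B': 'C'}
```

What the real protocols add on top of this toy, of course, is the hard part: flooding so that every router converges on the same view of the topology, and doing so continuously as the topology changes.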

Automation was central to the Internet ideal and experience: a large and diverse number of administrative entities, sharing no common management system, with (I assume) scarce operational resources and a topology that frequently changed.

There is nothing wrong with centralized systems; in fact, if they can provide global optimizations, I firmly believe they should augment distributed control planes. For example, if a centralized system can recommend routes that produce better overall outcomes for the administrative entity, then there should be a way for that wisdom to be injected into the network.
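One common way such wisdom gets injected is by preference: a controller-installed route outranks protocol-learned routes, and the distributed control plane remains as the fallback. The sketch below is purely illustrative (the `RouteTable` class, its methods, and the preference values are invented, loosely echoing the "administrative distance" idea where a lower value wins), not any vendor's API.

```python
class RouteTable:
    """Toy route table: a controller-injected route wins while present,
    and the protocol-learned route takes over when it is withdrawn."""

    # Lower preference value wins, as with administrative distance.
    CONTROLLER, PROTOCOL = 10, 100

    def __init__(self):
        self.routes = {}  # prefix -> {preference: next_hop}

    def install(self, prefix, next_hop, preference):
        self.routes.setdefault(prefix, {})[preference] = next_hop

    def withdraw(self, prefix, preference):
        self.routes.get(prefix, {}).pop(preference, None)

    def lookup(self, prefix):
        candidates = self.routes.get(prefix, {})
        if not candidates:
            return None
        return candidates[min(candidates)]  # lowest preference wins

rt = RouteTable()
rt.install("10.0.0.0/24", "via-igp", RouteTable.PROTOCOL)
rt.install("10.0.0.0/24", "via-controller", RouteTable.CONTROLLER)
assert rt.lookup("10.0.0.0/24") == "via-controller"  # centralized wisdom wins
rt.withdraw("10.0.0.0/24", RouteTable.CONTROLLER)
assert rt.lookup("10.0.0.0/24") == "via-igp"  # distributed fallback remains
```

The design point is the fallback: the centralized system optimizes when it can, but its absence degrades gracefully to what the distributed control plane computed on its own.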

What is or is not the “domain” of centralized versus distributed systems is, and perhaps always will be, an ongoing debate. Dynamic routing across different topologies is a hard problem. As with business processes, it is probably not always clear what assumptions are being made when a routing protocol is designed; that realization may come only later. When a routing protocol turns out not to be optimal for a given topology, the question arises: what is the best way to respond? Change the routing protocol, or work around it with a centralized management system? If routing-protocol automation is not optimal for the task at hand, then automation through a centralized management system is the next best thing. Automation through centralized systems is the paradigm that defines the cloud disruption we are all experiencing in I(C)T.

In a world where routing protocols are not optimal for a given topology, for example the dense Clos topologies in hyperscaler datacenters, those who can build the best centralized management systems are the companies that will have a competitive advantage. There is nothing inherently wrong with that. In business, there is an acceptance that the company that invests in something differentiated in the marketplace should be rewarded. On the other hand, if the IP community concludes that optimal support for a new type of topology is important for IP routing to continue its relevance and growth, then the community may decide to jump in and develop a solution that does not lean on a centralized system as might otherwise be required.

Routing has evolved beyond its initial roots. No longer do we think in terms of simply understanding routes and topology for a single layer. Networks support Ethernet and IP VPNs, network slicing work is ongoing in the IETF, MPLS and Segment Routing are part of the modern landscape, the number of IPv6 hosts is growing, and BGP, differentiated from IGPs by its strong eBGP-driven policy infrastructure, is finding more and more uses, well beyond its original purpose. Other differences between the path-vector-based BGP and the link-state protocols (IS-IS, OSPF) are also debated. In addition, NETCONF/YANG, gRPC/gNMI, and to some extent PCEP all suggest a future programmable and instrumented network where centralized systems play a larger role than they have in the past; they are the tools of the cloud paradigm.
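One of those path-vector versus link-state differences is how loops are avoided. Where link-state protocols compute loop-free paths from a shared topology map, BGP carries the list of autonomous systems a route has traversed and rejects any advertisement that already contains its own AS number. A minimal sketch of that mechanism (the AS numbers and function names here are illustrative, not a BGP implementation):

```python
def accept_route(my_asn, as_path):
    """Path-vector loop avoidance: reject the advertisement if our own
    AS number already appears in the AS path it carries."""
    return my_asn not in as_path

def advertise(my_asn, as_path):
    """Prepend our AS number before passing the route to an eBGP neighbor."""
    return [my_asn] + as_path

# AS 65001 originates a route; it propagates 65001 -> 65002 -> onward.
path = advertise(65002, advertise(65001, []))
print(path)                       # [65002, 65001]
print(accept_route(65003, path))  # True: no loop, accept
print(accept_route(65001, path))  # False: own ASN in path, reject
```

A nice property of this scheme is that loop avoidance needs no global topology view at all, which is part of why BGP scales across administrative boundaries that share no common map.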

One of the reasons the Internet is the most successful network on the planet is that critical aspects, route/topology discovery and route convergence (loop avoidance/mitigation), are automated. Outside of that, an IP network involves numerous complicated configurations. Automation led to the initial phases of growth; further automation, distributed and/or centralized, will lead to more growth. Growth for networks, and growth for individuals, moving from the grunt work of basic configuration and analysis to the higher-value work of networks realizing business/entity outcomes. How much of this automation should be distributed, how much should be centralized, and what the role of each is remains one of the important debates of our time; it has always been a debate in networking, and perhaps always will be; a debate that is not independent of the ever-changing nature of distributed and centralized CPU/NPU/memory resources.

In our imaginations, there is a future zero-touch network, where everything is discovered, and nothing is manually configured. To channel my inner JFK, “We choose…not because they are easy, but because they are hard, because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one which we intend to win, and the others, too.”

We automate to grow. We automate because that will serve to organize and measure the best of our energies.
