WHY AND HOW CHOICES ARE MADE IN NETWORK ARCHITECTURE
The Three Olive Martini model is a network design model that focuses on the forces that compel network designers to make choices (the “why”), and the optionality & tradeoffs of choices (the “how”). Optionality starts at a theoretical level, and is then narrowed down to the practical/likely subset based on organization structure, cultures, and preferences.
Figure 1. Three Olive Network Architecture Martini: Why and How Choices are Made
There are three basic areas where optionality exists, and between these areas, tradeoffs are made.
The three areas are:
- capacity, capability and quality of services supported by a network
- capacity and capability of network operations
- capacity and capability of the network itself
Technology, budgets, and organizational dynamics all play a role in optionality and tradeoffs.
Outcomes & Return
See previous articles:
Optionality and Tradeoffs
One approach to architecture and design can be:
- What is the total universe of options
- What options are realistically implementable
- Of the remaining options, which option best achieves the desired outcome with the least “cost”, where cost is a proxy for multiple decision points.
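The three steps above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a method prescribed by the model: the option names, the scores, the cost dimensions (`capex`, `opex`, `risk`), the weights, and the `min_outcome` threshold are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    feasible: bool      # can this organization realistically implement it?
    outcome: float      # 0.0 to 1.0, how well it achieves the desired outcome
    costs: dict         # hypothetical cost dimensions, each scored 0.0 to 1.0


# Hypothetical weights: "cost" is a proxy for multiple decision points.
WEIGHTS = {"capex": 0.4, "opex": 0.3, "risk": 0.3}


def weighted_cost(option, weights=WEIGHTS):
    """Collapse several cost dimensions into a single comparable number."""
    return sum(weight * option.costs[dim] for dim, weight in weights.items())


def choose(options, min_outcome=0.8):
    """Step 1: take the universe of options. Step 2: keep what is
    implementable. Step 3: pick the cheapest option that still achieves
    the desired outcome. Returns None if nothing qualifies."""
    implementable = [o for o in options if o.feasible]
    candidates = [o for o in implementable if o.outcome >= min_outcome]
    if not candidates:
        return None
    return min(candidates, key=weighted_cost)


# Illustrative universe of options (all values are made up).
OPTIONS = [
    Option("overprovision links", True, 0.90,
           {"capex": 0.9, "opex": 0.2, "risk": 0.2}),
    Option("traffic engineering", True, 0.85,
           {"capex": 0.3, "opex": 0.6, "risk": 0.5}),
    Option("new controller platform", False, 0.95,
           {"capex": 0.5, "opex": 0.4, "risk": 0.7}),
    Option("do nothing", True, 0.40,
           {"capex": 0.0, "opex": 0.1, "risk": 0.6}),
]

best = choose(OPTIONS)
print(best.name if best else "no viable option")
```

With these made-up numbers, the controller platform is pruned as not implementable, "do nothing" is pruned for missing the outcome, and traffic engineering narrowly beats overprovisioning on weighted cost. Shifting the weights, for example when CAPEX is easier to obtain than OPEX, changes the answer, which is where budget structure enters the decision.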
Figure 2. The bottom half of the martini glass – how decisions are made.
Figure 2 depicts the significant areas of tradeoff that are useful in discussing how network architecture decisions are made. The “bottom half” of the “martini glass” deals with where tradeoffs can be made, and the optionality within the major areas of tradeoffs:
- Service capabilities, capacity, and quality
- Operations capabilities and capacity
- Network capabilities and capacity
Note on optionality: Optionality has emerged in (business) strategy as a subject with supporters and detractors. Keeping options open can be a way of never fully committing to a strategy and therefore never achieving extraordinary returns. This article is not an argument for or against “keeping options open”. It simply seeks to discuss the range of options a network operator might consider, and where tradeoffs exist.
See previous article:
Capacity is arguably the biggest lever in network design. With enough capacity, many challenges dissolve, as with any problem in an economic system.
However, capacity is often finite, due to technological, budgetary, or operational limitations. Capacity refers to not only the most obvious of resources, the bandwidth between network nodes, but also compute and storage resources that are directly impacted by the volume, velocity, and variation of network state/information, especially control plane state.
In practice, capacity is a finite resource, but it can be raised to a level that alleviates some network design challenges, even if a capacity limit still remains. Whether to increase, maintain, or even decrease capacity is a decision to be made in every design.
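A back-of-the-envelope way to see how added capacity eases a design challenge is the textbook M/M/1 queue, where the mean time a packet spends at a link is 1 / (service rate − arrival rate). The sketch below is a simplification under stated assumptions: real traffic is not Poisson, and the offered load and link capacities are hypothetical numbers chosen for illustration.

```python
def mm1_mean_delay(arrival_pps, service_pps):
    """Mean time in system (waiting plus transmission), in seconds,
    for an M/M/1 queue. Rates are in packets per second."""
    if arrival_pps >= service_pps:
        raise ValueError("offered load meets or exceeds capacity; queue is unstable")
    return 1.0 / (service_pps - arrival_pps)


OFFERED_LOAD_PPS = 80_000  # hypothetical offered load

# Hypothetical link capacities: the same load at 80%, 40%, and 20% utilization.
for capacity_pps in (100_000, 200_000, 400_000):
    utilization = OFFERED_LOAD_PPS / capacity_pps
    delay_us = mm1_mean_delay(OFFERED_LOAD_PPS, capacity_pps) * 1e6
    print(f"{capacity_pps:>7} pps  utilization {utilization:.0%}  "
          f"mean delay {delay_us:.1f} us")
```

Under these assumptions, doubling capacity from 100,000 to 200,000 packets per second cuts mean delay from 50 µs to about 8.3 µs, while doubling again saves only about 5 µs more. Capacity alleviates the problem, but with diminishing returns, and a limit always remains.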
Capacity within finite systems
- Could the difference between a lightweight traffic engineering approach and a heavyweight approach be switching and link capacity?
- Could the difference between large IGP areas and small IGP areas be the amount of memory in each router within the area?
- Could the difference between relative stability and chaos during failure recovery/convergence be the size of the route engines in each router within an IGP area?
The answer to all of the above is “Yes”, in theory. In practice, it can be hard to know what the real options and tradeoffs are without a credible simulation tool. Without such tools, network designers may prefer to be conservative and not take risks, employing multiple defensive design elements: areas, summarization, traffic engineering, quality of service,…
Figure 3a. Defensive Network Design
Figure 3b. Optimization opportunities across risk, cost, and outcomes
While capacity is to some extent in the hands of the network designer, assuming available products / services support the desired scale and the money exists in the right budget, capability is a slightly more complex beast that involves multiple industry players, including technology suppliers and standards bodies.
Sometimes there is a tradeoff between a capability in a router and a capability in a management system (or, more frequently today, a controller) and/or some new activity for operations. Who makes tradeoffs such as these when technology is being developed: the standards body, the technology supplier? How would a technology supplier know the tradeoff that any single customer wants? Have technology suppliers historically been aware of the tradeoffs they were making between equipment / software design and a customer’s operations? Presumably, in the next epoch of networking, where automation/autonomy is a greater focus, these kinds of tradeoffs will get more attention.
That said, the general industry trend is towards simplifying the network itself, and putting any necessary complexity into operational systems. That approach was popularized in managing servers, and to some extent storage, in cloud environments. Network automation is in many ways behind the automation of other IT resources. Is this because network suppliers are dragging their heels? That is a debate. There is also something else that has been discovered on the journey to network automation.
While a server is to some extent a standalone unit / resource, the same is not as true for a router. A router is one node in a distributed control plane, and it is impacted by control plane events occurring in other parts of the network. So for networks, where should the simplicity be: the interface to a single router, or the interface to a network of routers? New initiatives like RIFT (Routing in Fat Trees) are exploring this question.
Additional capability can be viewed as additional complexity. Should that complexity be in the transport underlay, the service overlay, or the operations environment? These are the kinds of tradeoffs network designers have to consider, if they have the optionality to do so.
See previous articles:
Budgets have a huge impact on what happens in networks: whether capacity and capabilities are weighted towards the network or operations, the scope and scale of services, and ultimately overall quality. How budgets are structured within an organization, and how they are managed, can impact network architecture/design choices. Can investment freely transfer from one budget to another, to facilitate design choices? Are some budgets CAPEX and others OPEX, with different budget owners? Is new equipment only purchased, using CAPEX, once every 3-5 years?
Few things impact optionality as much as budget structure and culture. As a general rule, organizations with the most budgeting flexibility will have the most design optionality. The scale of a network and the degree of specialization are both factors that influence budget rigidity. DevOps/NetOps are opportunities to explore budget fluidity.
Network and Operations tradeoffs
Network architects, engineers, and operations, hereafter referred to as “network designers”, should be able to consider a broad scope of optionality and tradeoffs, for example as shown in Figure 4.
Figure 4. Operations & Network, Options and Tradeoffs (theoretical)
However, many industry conversations look more like Figure 5, and perhaps not even that sometimes, with discussion about network capabilities only.
Figure 5. Operations & Network, Options and Tradeoffs (real world)
Why do the total options for network architects / designers get pruned down to so few? Possible reasons include:
- Network operator silos
- Standards body silos
- Supplier silos
- Budget structures
- The new bandwagon effect
- Uncertainty about new approaches
Few of the above are about technology. Most of them are about people and organizational structure: people going about their everyday lives, executing the immediate, as they are often pressured to do, and not having the strength, will, or capability to jump over the hurdles placed in front of them.
This is where management and leadership play a role. In networking, operations used to be this device/fault management function that kept the network up once it was installed and configured. Now, in the age of cloud, network operations is a strategic lever in creating new options and value, as the industry has seen with some classes of networking. The amount of cloud-based compute & storage capacity that can be easily, and elastically, thrown at computational/data-intense problems, now far exceeds what is available on any single router, and even any network of routers.
We are in the early stages of exploring this new option for control, management, and service planes. The networking industry has already developed an intuition that automation is critically important, and that autonomy will be a possibility one day. These new possibilities have integrated, and will continue to integrate, software development and AI/machine learning into operations.
Network designers will have the most optionality, and the most opportunity to creatively achieve outcomes, where the barriers to change are lowest, first and foremost structural barriers like budget structures.
With that, next is a look at some specific options and tradeoffs.
Network Capacity vs Network & Operations Capabilities
Significant network and operations functionality is related to how much capacity is in the network: compute, storage, and link/switching.
Figure 6. Operations & Network, Capabilities & capacities
This is the tradeoff that network professionals are perhaps most conscious of. The reality of finite resources drives many networking capabilities, including quality of service (QoS), (Un)Equal Cost Multipath (ECMP), and traffic engineering (TE). Finite compute and storage resources also drive concerns about the speed and size of network state/information, ultimately leading to methods such as summarization and areas. Many of the concerns of network designers are rooted in finite resources.
Figure 7. What would and would not be needed if capacity were infinite
Figure 7 draws attention to some of the impacts of finite resources: the need for networking capabilities such as QoS, ECMP, TE, and state management. At the same time, if resources were infinite, then operations capacity would need to increase, perhaps infinitely, to install, set up, and maintain those resources. In real-world economic systems, there are no free lunches. One alternative to requiring infinite operations resources might be zero touch provisioning, automation, and autonomy. That becomes another source of complexity in network equipment and/or operations software / systems. So it might be possible to reduce bandwidth/packet and control plane management complexity, at the cost of increased capacity costs and increased operational complexity. In fact, we may already be seeing this in some hyperscaler designs.
Network and Service Quality Tradeoffs
The old joke about “Ma Bell” was that networks were literally “gold plated”. It was a reference to the great lengths telcos went to in order to achieve high levels of service quality. This desire was driven by both culture and regulatory pressures. IP networking was disruptive to that context, and everyone should remember that the Internet was an overlay on the then-prevailing underlay.
Service quality is ultimately a business model / value proposition question. One operator might decide to pursue the part of the market that is willing to pay a premium price for the highest possible quality, while another network operator may decide to pursue the part of the market that is willing to trade off some quality for a lower price. This kind of economic / business tradeoff was discussed in the article: Financing the Network.
Figure 8. Options and tradeoffs between Service Quality and the Network
Tradeoffs in this area of design are perhaps the most obvious and most familiar to most people. Does a network operator dramatically overprovision the bandwidth/switching capacity of a network, or does a network operator deploy one or more network capabilities, including, but not limited to:
- Priority-based forwarding
- Policy-based queuing
- Policy-based routing
- Multiple virtual topologies
- Traffic engineering
One or more of the above may also have capabilities, capacity, and complexity implications for operations as well.
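The choice between overprovisioning and priority-based forwarding can be illustrated with textbook queueing formulas. The sketch below compares mean queueing delay on a single link under three designs: plain FIFO, two-class non-preemptive strict priority, and FIFO on a link with doubled capacity. It assumes Poisson arrivals and the same exponential service rate for both classes, which real traffic will not match, and the traffic rates and capacities are hypothetical.

```python
def fifo_wait(arrival_pps, service_pps):
    """Mean time waiting in queue (seconds) for an M/M/1 FIFO queue."""
    if arrival_pps >= service_pps:
        raise ValueError("link is overloaded")
    rho = arrival_pps / service_pps
    return rho / (service_pps - arrival_pps)


def priority_waits(high_pps, low_pps, service_pps):
    """Mean queueing waits (seconds) for high and low priority classes
    under non-preemptive strict priority on one link, with both classes
    sharing the same exponential service rate."""
    rho_high = high_pps / service_pps
    rho_low = low_pps / service_pps
    if rho_high + rho_low >= 1.0:
        raise ValueError("link is overloaded")
    # Mean residual service time seen by an arriving packet.
    residual = (high_pps + low_pps) / service_pps ** 2
    wait_high = residual / (1.0 - rho_high)
    wait_low = residual / ((1.0 - rho_high) * (1.0 - rho_high - rho_low))
    return wait_high, wait_low


# Hypothetical traffic mix and link capacities (packets per second).
HIGH_PPS, LOW_PPS = 20_000, 60_000
BASE_CAPACITY, DOUBLE_CAPACITY = 100_000, 200_000

fifo_base = fifo_wait(HIGH_PPS + LOW_PPS, BASE_CAPACITY)
prio_high, prio_low = priority_waits(HIGH_PPS, LOW_PPS, BASE_CAPACITY)
fifo_double = fifo_wait(HIGH_PPS + LOW_PPS, DOUBLE_CAPACITY)

print(f"FIFO, base capacity:      {fifo_base * 1e6:.1f} us for all traffic")
print(f"Priority, base capacity:  {prio_high * 1e6:.1f} us high, "
      f"{prio_low * 1e6:.1f} us low")
print(f"FIFO, doubled capacity:   {fifo_double * 1e6:.1f} us for all traffic")
```

With these illustrative numbers, priority forwarding on the existing link improves the high priority class (about 10 µs versus 40 µs) at the expense of the low priority class (about 50 µs), while doubling capacity improves everyone (about 3.3 µs) with no added forwarding policy. Which design is "cheaper" depends on the relative cost of bandwidth versus the operational cost of defining, deploying, and maintaining the policy.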
This article has provided an introduction to the “how” of the three-olive martini model for network architecture / design. Additional examples of optionality and tradeoffs will be explored in future articles.
For a given outcome, complexity cannot be reduced beyond what is necessary. However, where complexity resides is driven by numerous design choices, the limitations of budget structures, and organizational dynamics. Basic optionality and tradeoffs reside in three areas: service capabilities, capacity, and quality; operations capabilities and capacity; and network capabilities and capacity.