Opinion: Automating the data center with IBN
Apstra CEO Mansour Karam says today’s hyperscale, leaf-spine switching data centres are so complex that established tools and techniques are no longer adequate to engineer, configure and manage their networks. In light of this, Karam explains below how Intent-Based Networking (IBN) adapts automation and AI to leap that gap between business intent and network complexity.
Joe Skorupa, VP Distinguished Analyst at Gartner Data Centre Convergence, said: “I have known major financial organisations make multi-million dollar investments only to rip-and-replace them the very next day if a technology comes along that improves their competitive edge… However the network hasn’t really changed in the last few decades because network folk are conservative. The reasons are quite clear: if a server in a data center fails, your application goes down; but if your network goes down your entire data centre goes down”.
Data centre operators are still looking for better ways to maximise uptime and increase operational agility while reducing operating costs. The good news is that there have been significant changes in the networking world, changes that I anticipated in June 2016 in my opening blog for Apstra.
A network is the ultimate in loosely-coupled highly-interdependent distributed multi-processor computing: anything, anywhere can affect anything else in ways that are far from obvious.
So I wrote: “The key to simplifying operations is to run the network as a system, as opposed to box by box. Networking systems are a distributed set of equipment running distributed protocols and routing applications. Network engineering requires complex control and visibility of this distributed set of equipment. Therefore, a distributed systems approach is required.”
The way networks work had not changed over the last 20 to 30 years. At Apstra we had to rethink the entire network concept in order to deliver order of magnitude improvements in capability and total cost of ownership.
The leaf spine advantage
Since then, there has been a major shift in data centre architecture.
According to Joe Skorupa: “Anything being built today with very little exception is all leaf spine – I would say 90 percent of what is being deployed today is leaf spine. The reason for the move to leaf spine is a change in application architecture – in the old three tier architecture it was a very North / South traffic flow - very oversubscribed. The number of apps has grown hugely so there is a lot more traffic happening within the data centre – East / West traffic - and leaf spine does a very good job of providing a very high bandwidth, low-latency relatively easy to manage network that does that”.
In leaf-spine (or ‘Clos’) topologies – now the de facto standard – every host is the same number of hops from any host on another leaf switch. Applications behave more predictably because traffic between hosts on different leaf switches only has to cross the ingress leaf switch, a spine switch and the egress leaf switch. This is vital for organisations running multi-tiered web applications, high-performance computing clusters or high-frequency trading.
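The equal-distance property can be checked with a few lines of Python. This is only an illustrative sketch: the switch names and the fabric size (four leaves, two spines) are assumptions chosen for the example, not figures from any real deployment.

```python
from collections import deque
from itertools import combinations

def build_leaf_spine(num_leaves, num_spines):
    """Return an adjacency map in which every leaf links to every spine."""
    leaves = [f"leaf{i}" for i in range(1, num_leaves + 1)]
    spines = [f"spine{i}" for i in range(1, num_spines + 1)]
    adj = {node: set() for node in leaves + spines}
    for leaf in leaves:
        for spine in spines:
            adj[leaf].add(spine)
            adj[spine].add(leaf)
    return leaves, spines, adj

def hop_count(adj, src, dst):
    """Breadth-first search: number of links on the shortest path."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # unreachable

# Hypothetical fabric: 4 leaves, 2 spines
leaves, spines, adj = build_leaf_spine(4, 2)

# Collect the distinct leaf-to-leaf distances across every pair of leaves
distances = {hop_count(adj, a, b) for a, b in combinations(leaves, 2)}
print(distances)  # {2}
```

Every pair of leaves is exactly two links apart (leaf, spine, leaf), whichever pair is chosen, which is why latency between hosts is so uniform.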
The traditional three-layer design relies on the Spanning Tree Protocol (STP) for loop prevention. STP detects loops and blocks the redundant links that form them.
This means that dual-homed access switches only use one of their two uplinks. Alternative protocols, such as SPB and TRILL, allow all links between leaf and spine to forward traffic, so the network can scale as traffic grows.
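The effect on bandwidth can be worked through as simple arithmetic. The port counts and link speeds below are hypothetical, and the sketch assumes that spanning tree leaves exactly one uplink forwarding per access switch, as described above for the dual-homed case.

```python
def usable_uplink_gbps(uplinks, link_gbps, multipath):
    """Uplink capacity that actually forwards traffic.

    Assumption: with spanning tree only one uplink forwards;
    with a multipath protocol all uplinks forward.
    """
    active = uplinks if multipath else 1
    return active * link_gbps

def oversubscription(downlink_gbps, uplink_gbps):
    """Ratio of host-facing capacity to usable uplink capacity."""
    return downlink_gbps / uplink_gbps

# Hypothetical switch: 48 x 10G host ports, 2 x 40G uplinks
downlink = 48 * 10

stp_ratio = oversubscription(downlink, usable_uplink_gbps(2, 40, multipath=False))
multipath_ratio = oversubscription(downlink, usable_uplink_gbps(2, 40, multipath=True))

print(stp_ratio)        # 12.0
print(multipath_ratio)  # 6.0
```

With these assumed figures, letting both uplinks forward halves the oversubscription ratio from 12:1 to 6:1 without adding any hardware.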
Leaf-spine networks allow interconnections to spread across a large number of spine switches, obviating the need for massive chassis switches. Chassis switches can still be used in the spine layer, but many organisations are saving cost by deploying fixed-switch spines.
Another major shift we anticipated back in 2016 was towards Application Programming Interfaces (APIs).
In that same blog I explained how more innovative customers were demanding that networking devices from leading vendors should be programmed through published APIs: “CLI wars are coming to an end and the landscape has shifted to a battle of APIs”.
For a very long time there had only been vertically integrated stacks of closed systems. Then the demand grew for open interfaces and disaggregation – hyperscalers were the first to realise the need for automated infrastructures.
Since then every vendor has responded. Now we have APIs that did not exist just five years ago – both to configure devices and to collect telemetry from them. These APIs enable a programmable infrastructure, so we can now automate our infrastructures using very powerful technologies such as machine learning and advanced analytics.
Intent-based networking is a method of automating your infrastructure in depth across the entire lifecycle: not just pushing out different configurations, but also bringing in advanced analytics and continuous validation to ensure that your network is behaving and continues to behave just as expected.
When you give a self-driving car instructions on where to go, that's your intent. The simpler the intent description – eg “take me back home” – the more sophisticated the software needs to be to interpret the intent correctly and carry it out faultlessly. Intent-based networking means operating networks as a system the same way a self-driving car operates a car as a system, not a set of individual components.
In a network context, intent is the outcome the network engineer wants without going deep into configuring each specific network device. IBN systems automatically convert a high-level description of desired network behaviour (according to business-level rules) into all the necessary low-level configuration data – which is then applied to elements in the underlying network infrastructure.
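To make the idea concrete, here is a deliberately simplified sketch of that conversion. The intent format, the `render_configs` function and the per-device output are all hypothetical illustrations; a real IBN product works from a far richer model and emits vendor-specific configuration.

```python
def render_configs(intent):
    """Expand a high-level intent into per-device data.

    Hypothetical intent format: counts of spine and leaf switches.
    Each device receives the list of links it is expected to have.
    """
    spines = [f"spine{i}" for i in range(1, intent["spines"] + 1)]
    leaves = [f"leaf{i}" for i in range(1, intent["leaves"] + 1)]
    configs = {}
    for name in spines + leaves:
        # A spine peers with every leaf; a leaf peers with every spine
        peers = leaves if name in spines else spines
        role = "spine" if name in spines else "leaf"
        configs[name] = {"role": role, "links": list(peers)}
    return configs

# The engineer states only the outcome wanted
intent = {"spines": 2, "leaves": 3}
configs = render_configs(intent)

for device, cfg in configs.items():
    print(device, cfg)
```

Two numbers of intent expand into five device definitions and twelve link entries here; at hundreds of devices, that expansion is the work the engineer no longer does by hand.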
Not only is less time spent planning the configuration of hundreds of devices, but the resulting changes can also be rolled out over the network’s control plane. This removes two opportunities for human error: manual planning and manual rollout. Recent figures suggest that 75% of data centre downtime is still due to common human error.
Of course it is then essential to monitor the resulting changes to make sure that they conform securely and reliably to the original intent – without compromising other functions on the network.
IBN does this in real time: using streaming telemetry and real-time analytics to continuously validate that the current network state remains consistent with the specified intent. It can then report exceptions or automatically reconfigure according to company policies.
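At its simplest, this kind of validation is a comparison between what was intended and what telemetry reports. The sketch below assumes, purely for illustration, that intent and observed state can each be reduced to a set of (leaf, spine) links; the device names and the `validate` function are hypothetical.

```python
def validate(intended_links, observed_links):
    """Compare intended state with observed state and list the exceptions."""
    return {
        "missing": sorted(intended_links - observed_links),
        "unexpected": sorted(observed_links - intended_links),
    }

# What the intent says should be up
intended = {
    ("leaf1", "spine1"), ("leaf1", "spine2"),
    ("leaf2", "spine1"), ("leaf2", "spine2"),
}

# What (hypothetical) telemetry currently reports as up
observed = {
    ("leaf1", "spine1"),
    ("leaf2", "spine1"), ("leaf2", "spine2"),
    ("leaf2", "spine3"),  # a cable plugged into the wrong device
}

report = validate(intended, observed)
print(report)
# {'missing': [('leaf1', 'spine2')], 'unexpected': [('leaf2', 'spine3')]}
```

Run against every telemetry update, a check like this turns the intent into a standing test of the network: an empty report means the network still matches what was asked for, and anything else is an exception to report or remediate.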
Organisations today are embracing digital transformation in many forms, including IoT, virtual reality, 5G, and machine learning. Every one of those technologies depends upon having a scalable, reliable and agile network at its heart.
IBN is a new technology, with only a handful of companies selling solutions, but it is attracting wider attention now that Moore’s Law has delivered the compute power needed to drive its engines.
It is the way to go, as Joe Skorupa explains: “The strength of intent-based networking is that finally we can generate a network design that is mathematically provable to be correct and to continually monitor the network to ensure that it remains mathematically correct. Then at the very best you get notified if something is wrong and at the very worst mediation kicks in and you’ve got a closed loop.”