New data center network architecture offers low latency and high throughput
Data traffic within data centers is growing rapidly along with the services they support. In Google's case, bandwidth requirements within its data centers have roughly doubled every year. Hardware has made a great deal of progress to keep up. On the switch side, the Tomahawk 4 recently reached 25.6 Terabits/sec. Transceiver technology recently climbed past 400G.
By 2021, about 95% of all data center traffic will originate from the cloud, and within cloud applications most packets are smaller than 500 bytes. As packets get smaller, switching has to get faster to match. Unfortunately, networks still struggle with latency. As a data center scales up, the electronic packet switched networks in use experience "long tail latency" that commonly reaches hundreds of milliseconds and above, orders of magnitude higher than median latency values. A latency spike that hits 1 out of 100 users might not seem like an issue, but when that 1% amounts to thousands of users it becomes a real problem.
A recently published architecture named PULSE offers an innovative solution. Benjamin et al. designed an optical circuit switched network that's controlled by distributed hardware schedulers. When modeled in MATLAB, the architecture's average latency was about 1 microsecond and its tail latency was about 5 microseconds. Its throughput was a staggering 25.6 petabits per second once tuning overhead is factored in, though the instantaneous node-to-node limit is 100 Gbps.
A few key features of the network make this possible. It uses parallel star couplers, which broadcast light from any port equally to all of the other connected ports. The design has 64 racks with 64 nodes each, and every node has multiple transceivers, each connecting the node to a different sub-network through a different star coupler. During data transfer, a transmitter and its receiver are tuned to the same time slot and wavelength. For every coupler there is a corresponding scheduler in the same rack that handles source-destination rack pairs. Nodes send their requests to the scheduler several epochs (cycle durations) in advance, and a scheduling algorithm calculates a new wavelength for each circuit cycle. The other key feature of the architecture is its nanosecond circuit reconfiguration speed.
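To make the scheduling idea more concrete, here is a minimal Python sketch of how a scheduler for one star-coupler sub-network might turn requests into circuits for a single epoch. It is a simple greedy first-fit allocator, not the algorithm from the PULSE paper. The names `Grant` and `schedule_epoch` and the slot and wavelength counts are hypothetical. It assumes that within one sub-network a wavelength can carry only one circuit per time slot, and that a transmitter or receiver can serve only one circuit per time slot.

```python
from typing import List, NamedTuple, Set, Tuple


class Grant(NamedTuple):
    """One circuit: src transmits to dst in a given slot on a given wavelength."""
    src: int
    dst: int
    slot: int
    wavelength: int


def schedule_epoch(
    requests: List[Tuple[int, int]],
    num_slots: int,
    num_wavelengths: int,
) -> List[Grant]:
    """Greedy first-fit allocation for one sub-network and one epoch.

    Illustrative only: this is NOT the PULSE scheduling algorithm.
    Requests that do not fit are left out and would have to be retried later.
    """
    tx_busy: Set[Tuple[int, int]] = set()  # (slot, src) pairs already transmitting
    rx_busy: Set[Tuple[int, int]] = set()  # (slot, dst) pairs already receiving
    wl_busy: Set[Tuple[int, int]] = set()  # (slot, wavelength) pairs already in use
    grants: List[Grant] = []

    for src, dst in requests:
        granted = False
        for slot in range(num_slots):
            # A node can send or receive only one circuit per slot.
            if (slot, src) in tx_busy or (slot, dst) in rx_busy:
                continue
            for wl in range(num_wavelengths):
                # The coupler broadcasts, so a wavelength is unique per slot.
                if (slot, wl) in wl_busy:
                    continue
                tx_busy.add((slot, src))
                rx_busy.add((slot, dst))
                wl_busy.add((slot, wl))
                grants.append(Grant(src, dst, slot, wl))
                granted = True
                break
            if granted:
                break
    return grants


if __name__ == "__main__":
    demo = schedule_epoch([(0, 1), (0, 2), (3, 1), (4, 5)], num_slots=2, num_wavelengths=2)
    for grant in demo:
        print(grant)
```

In this toy run, the two requests from node 0 land in different time slots, and the two requests to node 1 do as well, while unrelated pairs share a slot on different wavelengths. A hardware scheduler would have to make the same kind of decision within a fraction of an epoch, which is why requests are submitted ahead of time.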
Because the sub-networks are completely independent, wavelengths can be reused across them. As a result, the network can support over a quarter million channels. Further, the system allows for 100% wavelength usage. Buffering, addressing, and in-network switching aren't needed with this architecture. However, it does require extremely rapid filtering, scheduling, data recovery, tunable wavelength switching, and synchronization. Under this layout, nodes can share resources effectively and bottlenecks are minimized.
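The "quarter million channels" figure can be sanity-checked with some back-of-envelope arithmetic. The Python sketch below rests on two assumptions this article does not state: that there is one sub-network per source-destination rack pair (64 x 64), and that each sub-network carries 64 wavelengths. The 100 Gbps channel rate and the 25.6 petabit per second effective throughput are the figures quoted earlier in the article.

```python
RACKS = 64
WAVELENGTHS_PER_SUBNET = 64          # assumption, not stated in the article
CHANNEL_RATE_BPS = 100 * 10**9       # 100 Gbps node-to-node limit
REPORTED_THROUGHPUT_BPS = 256 * 10**14  # 25.6 Pbps, as quoted in the article


def channel_count(racks: int, wavelengths: int) -> int:
    """Channels if every source-destination rack pair has its own sub-network (assumption)."""
    sub_networks = racks * racks
    return sub_networks * wavelengths


def raw_capacity_bps(channels: int, rate_bps: int) -> int:
    """Aggregate capacity if every channel ran at full rate all the time."""
    return channels * rate_bps


channels = channel_count(RACKS, WAVELENGTHS_PER_SUBNET)
raw_bps = raw_capacity_bps(channels, CHANNEL_RATE_BPS)
efficiency = REPORTED_THROUGHPUT_BPS / raw_bps

if __name__ == "__main__":
    print(f"channels: {channels}")
    print(f"raw capacity: {raw_bps / 1e15:.4f} Pbps")
    print(f"implied efficiency after overhead: {efficiency:.2%}")
```

Under these assumptions the count comes to 262,144 channels and about 26.2 Pbps of raw capacity, so the quoted 25.6 Pbps would correspond to losing a little over 2% to overhead such as tuning time.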
One of the more surprising findings is that PULSE is actually quite cost effective relative to current network architectures, at roughly $5/Gbps. Helping matters, the network consumes only 82 picojoules per bit. The cost of transceivers is dropping, which would further benefit a system like PULSE. Additionally, during a data center refresh cycle only the end node transceivers would need to be upgraded, leading to greater cost savings.
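For a rough sense of scale, the per-bit and per-Gbps figures can be multiplied out against the quoted 25.6 petabit per second throughput. The Python sketch below is only a back-of-envelope estimate. It assumes the $5/Gbps and 82 pJ/bit figures apply uniformly at full throughput, which the article does not claim, and the function names are hypothetical.

```python
THROUGHPUT_BPS = 25.6e15      # 25.6 Pbps, as quoted in the article
ENERGY_PER_BIT_J = 82e-12     # 82 picojoules per bit
COST_PER_GBPS_USD = 5.0       # roughly $5 per Gbps


def network_power_watts(throughput_bps: float, energy_per_bit_j: float) -> float:
    """Power = bits per second x joules per bit (assumes full, sustained load)."""
    return throughput_bps * energy_per_bit_j


def network_cost_usd(throughput_bps: float, cost_per_gbps: float) -> float:
    """Cost = capacity in Gbps x dollars per Gbps (assumes the rate scales linearly)."""
    return (throughput_bps / 1e9) * cost_per_gbps


power_w = network_power_watts(THROUGHPUT_BPS, ENERGY_PER_BIT_J)
cost_usd = network_cost_usd(THROUGHPUT_BPS, COST_PER_GBPS_USD)

if __name__ == "__main__":
    print(f"power at full load: {power_w / 1e6:.2f} MW")
    print(f"cost at full capacity: ${cost_usd / 1e6:.0f} million")
```

Under these assumptions the network would draw on the order of 2 megawatts at full load and cost on the order of $128 million at full capacity. These are illustrative multiplications, not figures reported by the authors.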