Hyper-Scale DC
The Cloud is bringing a level of scale and explosive growth in the DC and the network that we have never seen before. And we are just at the beginning.
Cost is paramount to stay in the game. At hyper-scale, “brute force” does not work anymore: capacity in the DC is NOT “plenty and cheap.” We need to scale at
- To solve these formidable challenges, we have designed Hierarchical SDN (HSDN), an architectural solution that achieves hyper scale using surprisingly small forwarding tables in the network nodes. HSDN introduces a fundamentally new paradigm for the forwarding and control planes, in that all paths in the network are pre-established in the forwarding tables and the labels identify entire paths rather than simply destinations.
- These properties of HSDN dramatically simplify establishing tunnels, and thus enable optimal handling of both ECMP and any-to-any end-to-end TE, which in turn yields extremely high network utilization with small buffers in the switches.
- HSDN is suitable for a full SDN implementation, using a scalable SDN controller to configure all forwarding tables in the network nodes and in the endpoints, as well as a hybrid approach, using conventional routing protocols in conjunction with a controller.
- Our paper, "Hierarchical SDN for the Hyper-Scale, Hyper-Elastic Data Center and Cloud" at ACM SIGCOMM SOSR'15, gives a good overview of HSDN.
- “Hierarchical SDN for Hyper-scale Data Centers” (draft-fang-mpls-hsdn-for-hsdc) and “BGP-LU for HSDN Label Distribution” (draft-fang-idr-bgplu-for-hsdn) are the two main IETF drafts on HSDN.
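As a rough, back-of-the-envelope illustration of why a hierarchy of labels can keep per-node forwarding tables small, the sketch below compares a flat design (one entry per endpoint) with a hierarchy in which each label only selects among the children of one tier. The equal fan-out model and the function names `flat_table_size` and `hierarchical_table_size` are assumptions made for this example; they are not taken from the HSDN paper or drafts, whose actual partitioning may differ.

```python
def flat_table_size(num_endpoints: int) -> int:
    """Entries per node when every node holds one entry per endpoint."""
    return num_endpoints


def hierarchical_table_size(num_endpoints: int, levels: int) -> int:
    """Entries per node under an idealized, equal fan-out hierarchy.

    Assumption (illustrative only): the network is split into `levels`
    tiers, every tier has the same fan-out, and a node needs one entry
    per child in its own tier. The result is the smallest fan-out f
    such that f ** levels covers all endpoints.
    """
    if num_endpoints < 1 or levels < 1:
        raise ValueError("num_endpoints and levels must be positive")
    fanout = 1
    # Integer search avoids floating-point rounding in fractional powers.
    while fanout ** levels < num_endpoints:
        fanout += 1
    return fanout


if __name__ == "__main__":
    endpoints = 1_000_000
    print("flat:", flat_table_size(endpoints))
    for levels in (2, 3):
        print(f"{levels} levels:", hierarchical_table_size(endpoints, levels))
```

Under these assumptions, one million endpoints need a million entries per node in the flat case, but only about a thousand with two levels and about a hundred with three. The price is a deeper label stack on each packet.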
Network Overlay, Virtualization, Elasticity, and Service Availability
Cloud providers leverage virtualization technologies to enable customers to create VNs in the cloud and access compute and storage resources on demand to meet their applications’ needs dynamically.
- Elasticity, service velocity, agility, and service availability are key competitive battlegrounds for cloud providers.
- The ability to move customer VMs and VNFs within the DC and across DCs is essential for cloud providers to give customers more elastic access to computing, storage, and network resources, manage capacity effectively in the DC, improve service velocity, and hide failures.
- Achieving seamless VM and VNF mobility at scale brings yet another set of computational, convergence, and networking challenges, which require new mechanisms to be devised in both the overlay and the underlay.
Traffic Engineering
With its scale and complexity, the Cloud introduces new challenges in handling traffic and congestion.
Today, ECMP is the main mechanism in the DC to avoid congestion, but
- There has been spirited debate on whether extensive use of TE in the DC is helpful and needed. Today, any-to-any, end-to-end (i.e., server-to-server) TE is simply infeasible at scale for several reasons: TE may cause routing/forwarding table explosion; TE path and bandwidth allocation computation is an NP-complete problem; and signaling of the TE tunnels, say, using RSVP-TE, is heavy and takes time. As a consequence, today, TE is used primarily in the DCI/WAN, and only very sparingly in the DC.
- HSDN removes many of the issues that prevented extensive use of any-to-any, end-to-end TE in the DC and DCI/WAN and introduces the foundations for a new era of TE.
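For readers unfamiliar with ECMP, the following minimal sketch shows the usual idea: hash a flow's 5-tuple and use the result to pick one of several equal-cost paths, so that packets of the same flow always follow the same path. The function name `ecmp_pick`, the example path names, and the use of SHA-256 are assumptions for illustration only; real switches typically use cheaper hardware hashes, and nothing here is specific to HSDN.

```python
import hashlib
from typing import Sequence, Tuple

# A flow is identified by (src_ip, dst_ip, protocol, src_port, dst_port).
Flow = Tuple[str, str, int, int, int]


def ecmp_pick(flow: Flow, paths: Sequence[str]) -> str:
    """Deterministically map a flow to one of the equal-cost paths."""
    if not paths:
        raise ValueError("at least one path is required")
    key = "|".join(str(field) for field in flow).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(paths)
    return paths[index]


if __name__ == "__main__":
    paths = ["path-A", "path-B", "path-C", "path-D"]  # hypothetical names
    flow = ("10.0.0.1", "10.0.1.7", 6, 49152, 443)
    print(ecmp_pick(flow, paths))
```

Because the choice depends only on the flow identifier and not on current load, ECMP spreads flows statistically but cannot steer a large flow away from a congested path, which is the gap that TE aims to fill.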
