HULA: Scalable Load Balancing Using Programmable Data Planes

Authors:
Naga Katta (Princeton University)
Mukesh Hira (VMware)
Changhoon Kim (Barefoot Networks)
Anirudh Sivaraman (MIT CSAIL)
Jennifer Rexford (Princeton University)

SOSR '16: Proceedings of the Symposium on SDN Research, March 2016, Article No. 10, Pages 1–12
https://doi.org/10.1145/2890955.2890968

Published: 14 March 2016

ABSTRACT

Datacenter networks employ multi-rooted topologies (e.g., Leaf-Spine, Fat-Tree) to provide large bisection bandwidth. These topologies use a large degree of multipathing, and need a data-plane load-balancing mechanism to effectively utilize their bisection bandwidth. The canonical load-balancing mechanism is equal-cost multi-path routing (ECMP), which spreads traffic uniformly across multiple paths. Motivated by ECMP's shortcomings, congestion-aware load-balancing techniques such as CONGA have been developed. These techniques have two limitations. First, because switch memory is limited, they can only maintain a small amount of congestion-tracking state at the edge switches, and do not scale to large topologies. Second, because they are implemented in custom hardware, they cannot be modified in the field.
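
As a rough illustration of why ECMP is congestion-oblivious, the sketch below picks a path by hashing a flow's header fields and ignoring load entirely. The 5-tuple layout, the CRC32 hash, and the name `ecmp_next_hop` are assumptions made for illustration; real switches use their own hash functions.

```python
import zlib


def ecmp_next_hop(flow, next_hops):
    """Pick a next hop by hashing the flow identifier (illustrative only).

    `flow` is a tuple such as (src_ip, dst_ip, proto, src_port, dst_port).
    The choice depends only on the header fields, never on path congestion.
    """
    key = "|".join(str(part) for part in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


paths = ["spine1", "spine2", "spine3", "spine4"]
flow = ("10.0.0.1", "10.0.1.7", 6, 40312, 80)
chosen = ecmp_next_hop(flow, paths)
```

Every packet of a flow maps to the same path, so two large flows that hash to the same path keep colliding even when other paths are idle.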

This paper presents HULA, a data-plane load-balancing algorithm that overcomes both limitations. First, instead of having the leaf switches track congestion on all paths to a destination, each HULA switch tracks congestion for the best path to a destination through a neighboring switch. Second, we design HULA for emerging programmable switches and program it in P4 to demonstrate that HULA could be run on such programmable chipsets, without requiring custom hardware. We evaluate HULA extensively in simulation, showing that it outperforms a scalable extension to CONGA in average flow completion time (1.6× at 50% load, 3× at 90% load).
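
The abstract does not spell out the update rule, so the following is only a minimal sketch of the "best path through a neighboring switch" idea. It assumes that congestion reports arrive per destination with a path utilization value, and that a switch accepts either a strictly better path or a refresh from its current best neighbor. The class and method names are hypothetical, and this is Python rather than the P4 program the paper describes.

```python
from dataclasses import dataclass, field


@dataclass
class BestHopTable:
    """Per-switch state: one (next hop, utilization) entry per destination."""

    best_hop: dict = field(default_factory=dict)   # destination -> neighbor
    best_util: dict = field(default_factory=dict)  # destination -> utilization

    def on_update(self, dst, neighbor, path_util):
        """Process a congestion report for `dst` heard via `neighbor`.

        Returns True when the stored entry changed.
        """
        current = self.best_util.get(dst)
        # Accept a strictly better path, or a refresh from the current best
        # neighbor (so the entry can also worsen when that path congests).
        if current is None or path_util < current or self.best_hop[dst] == neighbor:
            changed = (self.best_hop.get(dst), current) != (neighbor, path_util)
            self.best_hop[dst] = neighbor
            self.best_util[dst] = path_util
            return changed
        return False

    def next_hop(self, dst):
        """Return the neighbor on the currently best known path, if any."""
        return self.best_hop.get(dst)


table = BestHopTable()
table.on_update("leaf9", "spine1", 0.7)
table.on_update("leaf9", "spine2", 0.4)
```

The state grows with the number of destinations rather than the number of paths, which is the scaling property the abstract emphasizes.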



Published in
SOSR '16: Proceedings of the Symposium on SDN Research
March 2016
178 pages
ISBN: 9781450342117
DOI: 10.1145/2890955
Program Chairs: Brighten Godfrey (UIUC), Martin Casado (VMware)
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
In-Network Load Balancing
Network Congestion
Programmable Switches
Scalability
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 7 of 43 submissions, 16%