skip to main content
10.1145/2890955.2890968acmconferencesArticle/Chapter ViewAbstractPublication PagescommConference Proceedingsconference-collections
research-article
Public Access

HULA: Scalable Load Balancing Using Programmable Data Planes

Published:14 March 2016Publication History

ABSTRACT

Datacenter networks employ multi-rooted topologies (e.g., Leaf-Spine, Fat-Tree) to provide large bisection bandwidth. These topologies use a large degree of multipathing, and need a data-plane load-balancing mechanism to effectively utilize their bisection bandwidth. The canonical load-balancing mechanism is equal-cost multi-path routing (ECMP), which spreads traffic uniformly across multiple paths. Motivated by ECMP's shortcomings, congestion-aware load-balancing techniques such as CONGA have been developed. These techniques have two limitations. First, because switch memory is limited, they can only maintain a small amount of congestion-tracking state at the edge switches, and do not scale to large topologies. Second, because they are implemented in custom hardware, they cannot be modified in the field.

This paper presents HULA, a data-plane load-balancing algorithm that overcomes both limitations. First, instead of having the leaf switches track congestion on all paths to a destination, each HULA switch tracks congestion for the best path to a destination through a neighboring switch. Second, we design HULA for emerging programmable switches and program it in P4 to demonstrate that HULA could be run on such programmable chipsets, without requiring custom hardware. We evaluate HULA extensively in simulation, showing that it outperforms a scalable extension to CONGA in average flow completion time (1.6 x at 50% load, 3 x at 90% load).

References

  1. N. Kang, Z. Liu, J. Rexford, and D. Walker, "Optimizing the "one big switch" abstraction in software-defined networks," CoNEXT '13, (New York, NY, USA), ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. M. Alizadeh and T. Edsall, "On the data path performance of leaf-spine datacenter fabrics," in HotInterconnects 2013, pp. 71--74. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. J. Perry, A. Ousterhout, H. Balakrishnan, D. Shah, and H. Fugal, "Fastpass: A centralized "zero-queue" datacenter network," SIGCOMM, 2014, (New York, NY, USA), pp. 307--318, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. V. Jeyakumar, M. Alizadeh, D. Mazières, B. Prabhakar, C. Kim, and A. Greenberg, "Eyeq: Practical network performance isolation at the edge," NSDI 2013, (Berkeley, CA, USA), pp. 297--312, USENIX Association. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. L. Popa, A. Krishnamurthy, S. Ratnasamy, and I. Stoica, "Faircloud: Sharing the network in cloud computing," HotNets-X, (New York, NY, USA), pp. 22:1--22:6, ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. M. Alizadeh, S. Yang, M. Sharif, S. Katti, N. McKeown, B. Prabhakar, and S. Shenker, "pfabric: Minimal near-optimal datacenter transport," SIGCOMM 2013, (New York, NY, USA), pp. 435--446, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. M. Chowdhury, Y. Zhong, and I. Stoica, "Efficient coflow scheduling with varys," in Proceedings of the 2014 ACM Conference on SIGCOMM, SIGCOMM '14, (New York, NY, USA), pp. 443--454, ACM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat, "Hedera: Dynamic flow scheduling for data center networks," NSDI 2010, (Berkeley, CA, USA), pp. 19--19, USENIX Association. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. T. Benson, A. Anand, A. Akella, and M. Zhang, "Microte: Fine grained traffic engineering for data centers," CoNEXT 2011, pp. 8:1--8:12, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. J. Cao, R. Xia, P. Yang, C. Guo, G. Lu, L. Yuan, Y. Zheng, H. Wu, Y. Xiong, and D. Maltz, "Per-packet load-balanced, low-latency routing for clos-based data center networks," CoNEXT 2013, pp. 49--60, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Kandula, D. Katabi, S. Sinha, and A. Berger, "Dynamic load balancing without packet reordering," SIGCOMM Comput. Commun. Rev., vol. 37, pp. 51--62, Mar. 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. S. Sen, D. Shue, S. Ihm, and M. J. Freedman, "Scalable, optimal flow routing in datacenters via local link balancing," CoNEXT 2013, pp. 151--162, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. M. Alizadeh, T. Edsall, S. Dharmapurikar, R. Vaidyanathan, K. Chu, A. Fingerhut, V. T. Lam, F. Matus, R. Pan, N. Yadav, and G. Varghese, "Conga: Distributed congestion-aware load balancing for datacenters," SIGCOMM Comput. Commun. Rev., vol. 44, pp. 503--514, Aug. 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. C.-Y. Hong, S. Kandula, R. Mahajan, M. Zhang, V. Gill, M. Nanduri, and R. Wattenhofer, "Achieving high utilization with software-driven wan," SIGCOMM 2013, pp. 15--26, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wanderer, J. Zhou, M. Zhu, J. Zolla, U. Hölzle, S. Stuart, and A. Vahdat, "B4: Experience with a globally-deployed software defined wan," SIGCOMM 2013, pp. 3--14, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. P. Bosshart, G. Gibb, H.-S. Kim, G. Varghese, N. McKeown, M. Izzard, F. Mujica, and M. Horowitz, "Forwarding Metamorphosis: Fast Programmable Match-action Processing in Hardware for SDN," in SIGCOMM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. "Intel FlexPipe." http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/ethernet-switch-fm6000-series-brief.pdf.Google ScholarGoogle Scholar
  18. "Cavium and XPliant introduce a fully programmable switch silicon family scaling to 3.2 terabits per second." http://tinyurl.com/nzbqtr3.Google ScholarGoogle Scholar
  19. P. Bosshart, D. Daly, G. Gibb, M. Izzard, N. McKeown, J. Rexford, C. Schlesinger, D. Talayco, A. Vahdat, G. Varghese, and D. Walker, "P4: Programming protocol-independent packet processors," SIGCOMM Comput. Commun. Rev., vol. 44, pp. 87--95, July 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. T. Issariyakul and E. Hossain, Introduction to Network Simulator NS2. Springer Publishing Company, Incorporated, 1st ed., 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. R. Niranjan Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and A. Vahdat, "Portland: A scalable fault-tolerant layer 2 data center network fabric," SIGCOMM 2009, pp. 39--50, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. "Cisco's massively scalable data center." http://www.cisco.com/c/dam/en/us/td/docs/solutions/Enterprise/Data_Center/MSDC/1-0/MSDC_AAG_1.pdf, Sept 2015.Google ScholarGoogle Scholar
  23. "High Capacity StrataXGS®Trident II Ethernet Switch Series." http://www.broadcom.com/products/Switching/Data-Center/BCM56850-Series.Google ScholarGoogle Scholar
  24. S. Hu, K. Chen, H. Wu, W. Bai, C. Lan, H. Wang, H. Zhao, and C. Guo, "Explicit path control in commodity data centers: Design and applications," NSDI 2015, pp. 15--28, USENIX Association. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta, "Vl2: A scalable and flexible data center network," SIGCOMM Comput. Commun. Rev., vol. 39, pp. 51--62, Aug. 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, and S. Lu, "Bcube: A high performance, server-centric network architecture for modular data centers," SIGCOMM 2009, pp. 63--74, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. E. Athanasopoulou, L. X. Bui, T. Ji, R. Srikant, and A. Stolyar, "Back-pressure-based packet-by-packet adaptive routing in communication networks," IEEE/ACM Trans. Netw., vol. 21, pp. 244--257, Feb. 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. B. Awerbuch and T. Leighton, "A simple local-control approximation algorithm for multicommodity flow," pp. 459--468, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. "P4 Specification." http://p4.org/wp-content/uploads/2015/11/p4-v1.1rc-Nov-17.pdf.Google ScholarGoogle Scholar
  30. S. Radhakrishnan, M. Tewari, R. Kapoor, G. Porter, and A. Vahdat, "Dahu: Commodity switches for direct connect data center networks," ANCS 2013, pp. 59--70, IEEE Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. A. Sivaraman, M. Budiu, A. Cheung, C. Kim, S. Licking, G. Varghese, H. Balakrishnan, M. Alizadeh, and N. McKeown, "Packet transactions: A programming model for data-plane algorithms at hardware speed," CoRR, vol. abs/1512.05023, 2015.Google ScholarGoogle Scholar
  32. "Protocol-independent switch architecture." http://schd.ws/hosted_files/p4workshop2015/c9/NickM-P4-Workshop-June-04-2015.pdf.Google ScholarGoogle Scholar
  33. "Members of the p4 consortium." http://p4.org/join-us/.Google ScholarGoogle Scholar
  34. "P4's action-execution semantics and conditional operators." https://github.com/anirudhSK/p4-semantics/raw/master/p4-semantics.pdf.Google ScholarGoogle Scholar
  35. Private communication with the authors of CONGA.Google ScholarGoogle Scholar
  36. M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, and M. Sridharan, "Data center tcp (dctcp)," SIGCOMM 2010, pp. 63--74, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. K. He, E. Rozner, K. Agarwal, W. Felter, J. Carter, and A. Akella, "Presto: Edge-based load balancing for fast datacenter networks," in SIGCOMM, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. C. Raiciu, S. Barre, C. Pluntke, A. Greenhalgh, D. Wischik, and M. Handley, "Improving datacenter performance and robustness with multipath tcp," SIGCOMM 2011, pp. 266--277, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. W. Bai, L. Chen, K. Chen, D. Han, C. Tian, and H. Wang, "Information-agnostic flow scheduling for commodity data centers," NSDI 2015, pp. 455--468, USENIX Association. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. D. Zats, T. Das, P. Mohan, D. Borthakur, and R. Katz, "Detail: Reducing the flow completion time tail in datacenter networks," SIGCOMM 2012, pp. 139--150, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. S. Kandula, D. Katabi, B. Davie, and A. Charny, "Walking the tightrope: Responsive yet stable traffic engineering," SIGCOMM 2005, pp. 253--264, ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. A. Elwalid, C. Jin, S. Low, and I. Widjaja, "Mate: Mpls adaptive traffic engineering," in IEEE INFOCOM 2001, pp. 1300--1309 vol. 3.Google ScholarGoogle Scholar
  43. N. Mchael and A. Tang, "Halo: Hop-by-hop adaptive link-state optimal routing," Networking, IEEE/ACM Transactions on, vol. PP, no. 99, pp. 1--1, 2014.Google ScholarGoogle Scholar
  44. R. Gallager, "A minimum delay routing algorithm using distributed computation," Communications, IEEE Transactions on, vol. 25, pp. 73--85, Jan 1977.Google ScholarGoogle ScholarCross RefCross Ref
  1. HULA: Scalable Load Balancing Using Programmable Data Planes

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SOSR '16: Proceedings of the Symposium on SDN Research
      March 2016
      178 pages
      ISBN:9781450342117
      DOI:10.1145/2890955

      Copyright © 2016 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 14 March 2016

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate7of43submissions,16%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader