skip to main content
10.1145/3098822.3098852acmconferencesArticle/Chapter ViewAbstractPublication PagescommConference Proceedingsconference-collections
research-article
Open Access

Carousel: Scalable Traffic Shaping at End Hosts

Published:07 August 2017Publication History

ABSTRACT

Traffic shaping, including pacing and rate limiting, is fundamental to the correct and efficient operation of both datacenter and wide area networks. Sample use cases include policy-based bandwidth allocation to flow aggregates, rate-based congestion control algorithms, and packet pacing to avoid bursty transmissions that can overwhelm router buffers. Driven by the need to scale to millions of flows and to apply complex policies, traffic shaping is moving from network switches into the end hosts, typically implemented in software in the kernel networking stack.

In this paper, we show that the performance overhead of end-host traffic shaping is substantial limits overall system scalability as we move to thousands of individual traffic classes per server. Measurements from production servers show that shaping at hosts consumes considerable CPU and memory, unnecessarily drops packets, suffers from head of line blocking and inaccuracy, and does not provide backpressure up the stack. We present Carousel, a framework that scales to tens of thousands of policies and flows per server, built from the synthesis of three key ideas: i) a single queue shaper using time as the basis for releasing packets, ii) fine-grained, just-in-time freeing of resources in higher layers coupled to actual packet departures, and iii) one shaper per CPU core, with lock-free coordination. Our production experience in serving video traffic at a Cloud service provider shows that Carousel shapes traffic accurately while improving overall machine CPU utilization by 8% (an improvement of 20% in the CPU utilization attributed to networking) relative to state-of-art deployments. It also conforms 10 times more accurately to target rates, and consumes two orders of magnitude less memory than existing approaches.

Skip Supplemental Material Section

Supplemental Material

carouselscalabletrafficshapingatendhosts.webm

webm

69.4 MB

References

  1. 2002. pfifo-tc: PFIFO Qdisc. (2002). https://linux.die.net/man/8/tc-pfifo/.Google ScholarGoogle Scholar
  2. 2003. Linux Hierarchical Token Bucket. (2003). http://luxik.cdi.cz/devik/qos/htb/.Google ScholarGoogle Scholar
  3. 2008. net: Generic receive offload. (2008). https://lwn.net/Articles/358910/.Google ScholarGoogle Scholar
  4. 2012. tcp: TCP Small Queues. (2012). https://lwn.net/Articles/506237/.Google ScholarGoogle Scholar
  5. 2013. pkt_sched: fq: Fair Queue packet scheduler. (2013). https://lwn.net/Articles/564825/.Google ScholarGoogle Scholar
  6. 2013. tcp: TSO packets automatic sizing. (2013). https://lwn.net/Articles/564979/.Google ScholarGoogle Scholar
  7. 2016. neper: a Linux networking performance tool. (2016). https://github.com/google/neper.Google ScholarGoogle Scholar
  8. 2017. Leaky bucket as a queue. (2017). https://en.wikipedia.org/wiki/Leaky_bucket.Google ScholarGoogle Scholar
  9. Mohammad Alizadeh, Abdul Kabbani, Tom Edsall, Balaji Prabhakar, Amin Vahdat, and Masato Yasuda. 2012. Less Is More: Trading a Little Bandwidth for Ultra-Low Latency in the Data Center. In NSDI '12. 253--266.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. W. Almesberger, J. H. Salim, and A. Kuznetsov. 1999. Differentiated services on Linux. In GLOBECOM '99. 831--836 vol. 1b. Google ScholarGoogle ScholarCross RefCross Ref
  11. Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia. 2010. A View of Cloud Computing. Commun. ACM 53, 4 (2010), 50--58. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Wei Bai, Kai Chen, Haitao Wu, Wuwei Lan, and Yangming Zhao. 2014. PAC: taming TCP Incast congestion using proactive ACK control. In ICNP '14. 385--396. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Hitesh Ballani, Paolo Costa, Thomas Karagiannis, and Ant Rowstron. 2011. Towards Predictable Datacenter Networks. SIGCOMM Comput. Commun. Rev. 41, 4 (Aug. 2011), 242--253. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Neda Beheshti, Yashar Ganjali, Monia Ghobadi, Nick McKeown, and Geoff Salmon. 2008. Experimental Study of Router Buffer Sizing. In IMC '08. 197--210. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Jesper Dangaard Brouer. 2015. Network stack challenges at increasing speeds: The 100Gbit/s challenge. LinuxCon North America. (2015). http://events.linuxfoundation.org/sites/events/files/slides/net_stack_challenges_100G_1.pdfGoogle ScholarGoogle Scholar
  16. Randy Brown. 1988. Calendar queues: a fast O(1) priority queue implementation for the simulation event set problem. Commun. ACM 31, 10 (1988), 1220--1227. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Van Jacobson, and Soheil Yeganeh. 2016. BBR: Congestion-Based Congestion Control. ACM Queue (2016), 50:20--50:53.Google ScholarGoogle Scholar
  18. Yuchung Cheng and Neal Cardwell. 2016. Making Linux TCP Fast. In Netdev Conference.Google ScholarGoogle Scholar
  19. David D Clark. 1985. The structuring of systems using upcalls. In SOSP '85. 171--180.Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Glenn William Connery, W Paul Sherer, Gary Jaszewski, and James S Binder. 1999. Offload of TCP segmentation to a smart adapter. US Patent 5,937,169. (August 1999).Google ScholarGoogle Scholar
  21. Tobias Flach, Pavlos Papageorge, Andreas Terzis, Luis Pedrosa, Yuchung Cheng, Tayeb Karim, Ethan Katz-Bassett, and Ramesh Govindan. 2016. An Internet-Wide Analysis of Traffic Policing. In SIGCOMM '16. 468--482. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Yashar Ganjali and Nick McKeown. 2006. Update on Buffer Sizing in Internet Routers. SIGCOMM Comput. Commun. Rev. (2006), 67--70. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Monia Ghobadi, Yuchung Cheng, Ankur Jain, and Matt Mathis. 2012. Trickle: Rate Limiting YouTube Video Streaming. In USENIX ATC '12. 191--196.Google ScholarGoogle Scholar
  24. Sangjin Han, Keon Jang, Aurojit Panda, Shoumik Palkar, Dongsu Han, and Sylvia Ratnasamy. 2015. SoftNIC: A Software NIC to Augment Hardware. Technical Report UCB/EECS-2015-155. EECS Department, University of California, Berkeley.Google ScholarGoogle Scholar
  25. Chi-Yao Hong, Srikanth Kandula, Ratul Mahajan, Ming Zhang, Vijay Gill, Mohan Nanduri, and Roger Wattenhofer. 2013. Achieving High Utilization with Software-driven WAN. SIGCOMM Comput. Commun. Rev. 43, 4 (2013), 15--26. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Intel Networking Division. 2016. 82599 10 GbE Controller Datasheet. (2016).Google ScholarGoogle Scholar
  27. Intel Networking Division. 2016. Ethernet Switch FM10000. (2016).Google ScholarGoogle Scholar
  28. Sushant Jain, Alok Kumar, Subhasree Mandal, Joon Ong, Leon Poutievski, Arjun Singh, Subbaiah Venkata, Jim Wanderer, Junlan Zhou, Min Zhu, Jon Zolla, Urs Hölzle, Stephen Stuart, and Amin Vahdat. 2013. B4: Experience with a Globally-deployed Software Defined Wan. In SIGCOMM '13. 3--14.Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Keon Jang, Justine Sherry, Hitesh Ballani, and Toby Moncaster. 2015. Silo: Predictable Message Latency in the Cloud. In SIGCOMM '15. 435--448.Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Vimalkumar Jeyakumar, Mohammad Alizadeh, David Mazières, Balaji Prabhakar, Albert Greenberg, and Changhoon Kim. 2013. EyeQ: Practical Network Performance Isolation at the Edge. In NSDI '13. 297--311.Google ScholarGoogle Scholar
  31. Antoine Kaufmann, SImon Peter, Naveen Kr. Sharma, Thomas Anderson, and Arvind Krishnamurthy. 2016. High Performance Packet Processing with FlexNIC. In ASPLOS '16. 67--81.Google ScholarGoogle Scholar
  32. Alok Kumar, Sushant Jain, Uday Naik, Anand Raghuraman, Nikhil Kasinadhuni, Enrique Cauich Zermeno, C. Stephen Gunn, Jing Ai, Björn Carlin, Mihai Amarandei-Stavila, Mathieu Robin, Aspi Siganporia, Stephen Stuart, and Amin Vahdat. 2015. BwE: Flexible, Hierarchical Bandwidth Allocation for WAN Distributed Computing. In SIGCOMM '15. 1--14.Google ScholarGoogle Scholar
  33. Gautam Kumar, Srikanth Kandula, Peter Bodik, and Ishai Menache. 2013. Virtualizing Traffic Shapers for Practical Resource Allocation. In 5th USENIX Workshop on Hot Topics in Cloud Computing.Google ScholarGoogle Scholar
  34. Adam Langley, Alistair Riddoch, Alyssa Wilk, Antonio Vicente, Charles Krasic, Dan Zhang, Fan Yang, Fedor Kouranov, Ian Swett, Janardhan Iyengar, Jeff Bailey, Jeremy Dorfman, Jim Roskind, Joanna Kulik, Patrik Westin, Raman Tenneti, Robbie Shade, Ryan Hamilton, Victor Vasiliev, Wan-Teh Chang, and Zhongyi Shi. 2017. The QUIC Transport Protocol: Design and Internet-Scale Deployment. In SIGCOMM '17.Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Radhika Mittal, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Universal Packet Scheduling. In NSDI '16. 501--521.Google ScholarGoogle Scholar
  36. Radhika Mittal, Vinh The Lam, Nandita Dukkipati, Emily Blem, Hassan Wassel, Monia Ghobadi, Amin Vahdat, Yaogong Wang, David Wetherall, and David Zats. 2015. TIMELY: RTT-based Congestion Control for the Datacenter. In SIGCOMM '15. 537--550.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Lucian Popa, Praveen Yalagandula, Sujata Banerjee, Jeffrey C. Mogul, Yoshio Turner, and Jose Renato Santos. 2013. ElasticSwitch: Practical Work-conserving Bandwidth Guarantees for Cloud Computing. SIGCOMM Comput. Commun. Rev. 43, 4 (Aug. 2013), 351--362. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Sivasankar Radhakrishnan, Yilong Geng, Vimalkumar Jeyakumar, Abdul Kabbani, George Porter, and Amin Vahdat. 2014. SENIC: Scalable NIC for End-Host Rate Limiting. In NSDI '14.Google ScholarGoogle Scholar
  39. Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, and Alex C. Snoeren. 2007. Cloud Control with Distributed Rate Limiting. In SIGCOMM '07. 337--348. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Rusty Russell. 2008. virtio: towards a de-facto standard for virtual I/O devices. ACM SIGOPS Operating Systems Review 42, 5 (2008), 95--103. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. M. Shreedhar and George Varghese. 1995. Efficient Fair Queueing Using Deficit Round Robin. In SIGCOMM '95. 375--385.Google ScholarGoogle Scholar
  42. Anirudh Sivaraman, Suvinay Subramanian, Mohammad Alizadeh, Sharad Chole, Shang-Tse Chuang, Anurag Agrawal, Hari Balakrishnan, Tom Edsall, Sachin Katti, and Nick McKeown. 2016. Programmable Packet Scheduling at Line Rate. In SIGCOMM '16. 44--57. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. G. Varghese and T. Lauck. 1987. Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility. In Proceedings of the Eleventh ACM Symposium on Operating Systems Principles (SOSP '87). 25--38. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. L. Zhang. 1990. Virtual Clock: A New Traffic Control Algorithm for Packet Switching Networks. In SIGCOMM '90. 19--29. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Carousel: Scalable Traffic Shaping at End Hosts

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGCOMM '17: Proceedings of the Conference of the ACM Special Interest Group on Data Communication
      August 2017
      515 pages
      ISBN:9781450346535
      DOI:10.1145/3098822

      Copyright © 2017 Owner/Author

      This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 7 August 2017

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate554of3,547submissions,16%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader