research-article
Open Access

TIMELY: RTT-based Congestion Control for the Datacenter

Published: 17 August 2015

Abstract

Datacenter transports aim to deliver low-latency messaging together with high throughput. We show that simple packet delay, measured as round-trip times (RTTs) at hosts, is an effective congestion signal that requires no switch feedback. First, we show that advances in NIC hardware have made it possible to measure RTT with microsecond accuracy, and that these RTTs are sufficient to estimate switch queueing. Then we describe how TIMELY adjusts transmission rates using RTT gradients to keep packet latency low while delivering high bandwidth. We implement our design in host software running over NICs with OS-bypass capabilities. Experiments with up to hundreds of machines on a Clos network topology show that it provides excellent performance: turning on TIMELY for OS-bypass messaging over a fabric with PFC lowers 99th-percentile tail latency by 9X while maintaining near line-rate throughput. Our system also outperforms DCTCP running in an optimized kernel, reducing tail latency by 13X. To the best of our knowledge, TIMELY is the first delay-based congestion control protocol for use in the datacenter, and it achieves its results despite having an order of magnitude fewer RTT signals (due to NIC offload) than earlier delay-based schemes such as Vegas.
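The idea of adjusting a sending rate from RTT gradients can be illustrated with a minimal sketch. The abstract does not specify the update rule, so everything below is an assumption made for illustration: the class name `GradientRateController`, the smoothing weight, the additive step, the decrease factor `beta`, and the base RTT are hypothetical values, not parameters taken from the paper. The sketch smooths the difference between consecutive RTT samples, normalizes it by an assumed base RTT, then increases the rate additively when delay is flat or falling and decreases it multiplicatively when delay is rising.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GradientRateController:
    """Illustrative delay-gradient rate controller (all constants are assumptions)."""

    rate_mbps: float = 1000.0        # current sending rate
    min_rtt_us: float = 20.0         # assumed base (uncongested) RTT
    add_step_mbps: float = 10.0      # additive increase step
    beta: float = 0.8                # multiplicative decrease factor
    ewma_alpha: float = 0.5          # weight given to the newest RTT difference
    line_rate_mbps: float = 10000.0  # upper bound on the rate
    prev_rtt_us: Optional[float] = None
    rtt_diff_us: float = 0.0         # smoothed difference between RTT samples

    def queueing_delay_us(self, rtt_us: float) -> float:
        """Estimate queueing delay as the RTT in excess of the base RTT."""
        return max(0.0, rtt_us - self.min_rtt_us)

    def on_rtt_sample(self, rtt_us: float) -> float:
        """Update the rate from one RTT sample and return the new rate."""
        if self.prev_rtt_us is None:
            # First sample: no gradient can be computed yet.
            self.prev_rtt_us = rtt_us
            return self.rate_mbps

        new_diff = rtt_us - self.prev_rtt_us
        self.prev_rtt_us = rtt_us
        a = self.ewma_alpha
        self.rtt_diff_us = (1 - a) * self.rtt_diff_us + a * new_diff

        # Normalize so the gradient is dimensionless.
        gradient = self.rtt_diff_us / self.min_rtt_us

        if gradient <= 0:
            # Delay is flat or falling: probe for more bandwidth.
            self.rate_mbps += self.add_step_mbps
        else:
            # Delay is rising: back off in proportion to the gradient.
            self.rate_mbps *= 1 - self.beta * min(gradient, 1.0)

        # Keep the rate within [add_step_mbps, line_rate_mbps].
        self.rate_mbps = min(self.line_rate_mbps,
                             max(self.add_step_mbps, self.rate_mbps))
        return self.rate_mbps


ctrl = GradientRateController()
r1 = ctrl.on_rtt_sample(20.0)  # first sample, rate unchanged
r2 = ctrl.on_rtt_sample(20.0)  # flat RTT, additive increase
r3 = ctrl.on_rtt_sample(30.0)  # rising RTT, multiplicative decrease
```

Reacting to the change in delay rather than to its absolute value lets a controller of this kind back off while a queue is still building. A production design would also need the safeguards that this sketch omits, such as handling of very low or very high absolute RTTs.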



    • Published in

      ACM SIGCOMM Computer Communication Review, Volume 45, Issue 4
      SIGCOMM'15
      October 2015
      659 pages
      ISSN: 0146-4833
      DOI: 10.1145/2829988

      SIGCOMM '15: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication
      August 2015
      684 pages
      ISBN: 9781450335423
      DOI: 10.1145/2785956

      Copyright © 2015 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States
