DOI: 10.1145/3098822.3098829

Language-Directed Hardware Design for Network Performance Monitoring

Published: 07 August 2017

ABSTRACT

Network performance monitoring today is restricted by existing switch support for measurement, forcing operators to rely heavily on endpoints with poor visibility into the network core. Switch vendors have added progressively more monitoring features to switches, but the current trajectory of adding specific features is unsustainable given the ever-changing demands of network operators. Instead, we ask what switch hardware primitives are required to support an expressive language of network performance questions. We believe that the resulting switch hardware design could address a wide variety of current and future performance monitoring needs.

We present a performance query language, Marple, modeled on familiar functional constructs like map, filter, groupby, and zip. Marple is backed by a new programmable key-value store primitive on switch hardware. The key-value store performs flexible aggregations at line rate (e.g., a moving average of queueing latencies per flow), and scales to millions of keys. We present a Marple compiler that targets a P4-programmable software switch and a simulator for high-speed programmable switches. Marple can express switch queries that could previously run only on end hosts, while Marple queries occupy only a modest fraction of a switch's hardware resources.
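
The per-flow aggregation mentioned above (a moving average of queueing latencies per flow) can be illustrated with a minimal Python sketch. This is not Marple syntax and not the paper's implementation: the `Packet` fields (`flow`, `tin`, `tout`), the weight `ALPHA`, and the helper `groupby_fold` are all hypothetical names chosen for illustration, and a plain dictionary stands in for the switch's key-value store.

```python
from collections import namedtuple

# Hypothetical packet record; the field names are assumptions for illustration.
Packet = namedtuple("Packet", ["flow", "tin", "tout"])

ALPHA = 0.5  # assumed moving-average weight, picked so the arithmetic is exact


def ewma(avg, pkt, alpha=ALPHA):
    """Fold one packet's queueing latency (tout - tin) into a running average."""
    latency = pkt.tout - pkt.tin
    return latency if avg is None else (1 - alpha) * avg + alpha * latency


def groupby_fold(packets, key, fold, pred=lambda p: True):
    """Filter packets with pred, then keep one folded value per key."""
    store = {}  # stands in for the switch's key-value store
    for pkt in packets:
        if pred(pkt):
            k = key(pkt)
            store[k] = fold(store.get(k), pkt)
    return store


packets = [
    Packet("A", 0, 4),    # latency 4
    Packet("B", 1, 3),    # latency 2
    Packet("A", 5, 13),   # latency 8
    Packet("A", 14, 16),  # latency 2
]
averages = groupby_fold(packets, key=lambda p: p.flow, fold=ewma)
print(averages)
```

Each packet updates only the entry for its own key, which is the property that makes this kind of aggregation a natural fit for a key-value store updated once per packet.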

Supplemental Material

languagedirectedhardwaredesignfornetworkperformancemonitoring.webm (webm, 138.9 MB)

        Published in

        SIGCOMM '17: Proceedings of the Conference of the ACM Special Interest Group on Data Communication
        August 2017
        515 pages
        ISBN: 9781450346535
        DOI: 10.1145/3098822

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery, New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 554 of 3,547 submissions, 16%
