DOI: 10.1145/2110217.2110219
research-article

dFtree: a fat-tree routing algorithm using dynamic allocation of virtual lanes to alleviate congestion in infiniband networks

Published: 14 November 2011

ABSTRACT

End-point hot-spots can cause major slowdowns in interconnection networks due to head-of-line blocking and congestion. Avoiding congestion is therefore important for ensuring high performance for network traffic. It is especially important in situations where permanent congestion, which results in permanent slowdown, can occur. Permanent congestion occurs when traffic has been moved away from a failed link, when multiple jobs run on the same system and compete for network resources, or when a system is not balanced for the application that runs on it.

In this paper we suggest a mechanism for dynamic allocation of virtual lanes and live optimization of the distribution of flows between the allocated virtual lanes. The purpose is to alleviate the negative effect of permanent congestion by separating network flows into slow-lane and fast-lane traffic. Flows destined for an end-point hot-spot are placed in the slow lane, and all other flows are placed in the fast lane. Consequently, the flows in the fast lane are unaffected by the head-of-line blocking created by the hot-spot traffic.
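
The slow-lane/fast-lane separation described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Flow` record, the lane indices `FAST_VL` and `SLOW_VL`, and the function `assign_virtual_lanes` are hypothetical names chosen for illustration, and the set of hot-spot end-points is assumed to be given as input, since the abstract does not say how hot-spots are detected.

```python
from collections import namedtuple

# Hypothetical flow record; the field names are illustrative only.
Flow = namedtuple("Flow", ["src", "dst"])

FAST_VL = 0  # assumed lane index for ordinary traffic
SLOW_VL = 1  # assumed lane index for traffic destined for a hot-spot


def assign_virtual_lanes(flows, hotspots):
    """Place flows whose destination is a hot-spot in the slow lane,
    and every other flow in the fast lane."""
    hotspots = set(hotspots)
    return {
        flow: (SLOW_VL if flow.dst in hotspots else FAST_VL)
        for flow in flows
    }


# Two flows contend for end-point "H"; the third goes elsewhere.
flows = [Flow("A", "H"), Flow("B", "H"), Flow("C", "D")]
lanes = assign_virtual_lanes(flows, hotspots={"H"})
```

In this toy example the flow from "C" to "D" stays in the fast lane, so it does not queue behind the traffic converging on "H". Re-running the function with an updated hot-spot set stands in, very loosely, for the live redistribution the abstract mentions.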

We demonstrate the feasibility of this approach using a modified version of OFED and OpenSM with fat-tree routing on a small InfiniBand cluster. Our experiments show an increase in throughput ranging from 150% to 468% compared to the conventional fat-tree algorithm in OFED.


    • Published in

      NDM '11: Proceedings of the first international workshop on Network-aware data management
      November 2011
      84 pages
      ISBN:9781450311328
      DOI:10.1145/2110217

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 14 of 23 submissions, 61%
