Abstract
Folded-Clos networks, also known as fat-trees, have been widely used as interconnects in large-scale high-performance computing clusters. Although users often treat such interconnects as replacements of nonblocking crossbar switches that can carry out any permutation communication without contention, the networking capability of such interconnects without a centralized controller in computer communication environments is not well understood. In this article, we investigate nonblocking two-level folded-Clos networks with deterministic single-path routing, but no centralized controller, and establish the nonblocking condition. The results indicate that nonblocking two-level folded-Clos networks without a centralized controller are much more expensive to construct than the traditional nonblocking networks in the telecommunication environment. Practical two-level folded-Clos based interconnects are blocking. For such interconnects, we establish the lower bound for worst-case contention for permutations with any deterministic single-path routing scheme, show that existing routing schemes perform poorly in terms of worst-case contention for permutations, present a routing scheme that achieves the theoretical optimal, and empirically compare the performance of existing schemes with the optimal routing scheme. The techniques developed for two-level folded-Clos networks are further extended for the general fat-trees of any heights.
- Y. Aydogan, C. B. Stunkel, C. Aykanat, and B. Abali. 1996. Adaptive source routing in multistage interconnection networks. In Proceedings of the 10th International Parallel Processing Symposium. 258--267. DOI:http://dx.doi.org/10.1109/IPPS.1996.508067 Google ScholarDigital Library
- V. E. Benes. 1962. On rearrangeable three-stage connecting networks. Bell System Technical Journal 41, 5, 1481--1492.Google ScholarCross Ref
- V. E. Benes. 1965. Mathematical Theory of Connecting Networks and Telephone Traffic. Vol. 17. Academic Press.Google Scholar
- Jehoshua Bruck, Ching-Tien Ho, Eli Upfal, Shlomo Kipnis, and Derrick Weathersby. 1997. Efficient algorithms for all-to-all communications in multiport message-passing systems. IEEE Transactions on Parallel and Distributed Systems 8, 11, 1143--1156. DOI:http://dx.doi.org/10.1109/71.642949 Google ScholarDigital Library
- Charles Clos. 1953. A study of non-blocking switching networks. Bell System Technical Journal 32, 2, 406--424. DOI:http://dx.doi.org/10.1002/j.1538-7305.1953.tb01433.xGoogle ScholarCross Ref
- J. Flich, M. P. Malumbres, P. Lopez, and J. Duato. 2000. Improving routing performance in Myrinet networks. In Proceedings of the 14th International Parallel and Distributed Processing Symposium (IPDPS'00). 27--32. DOI:http://dx.doi.org/10.1109/IPDPS.2000.845961 Google ScholarDigital Library
- P. Geoffray and T. Hoefler. 2008. Adaptive routing strategies for modern high performance networks. In Proceedings of the 16th IEEE Symposium on High Performance Interconnects. 165--172. DOI:http://dx.doi. org/10.1109/HOTI.2008.21 Google ScholarDigital Library
- C. Gomez, F. Gilabert, M. E. Gomez, P. Lopez, and J. Duato. 2007. Deterministic versus adaptive routing in fat-trees. In Proceedings of the 21st IEEE International Parallel and Distributed Processing Symposium. 1--8. DOI:http://dx.doi.org/10.1109/IPDPS.2007.370482Google Scholar
- Ronald I. Greenberg and C. E. Leiserson. 1985. Randomized routing on fat-tress. In Proceedings of the 26th Annual Symposium on Foundations of Computer Science. 241--249. DOI:http://dx.doi.org/10.1109/ SFCS.1985.46 Google ScholarDigital Library
- T. Hoefler, T. Schneider, and A. Lumsdaine. 2008. Multistage switches are not crossbars: Effects of static routing in high-performance networks. In Proceedings of the 2008 IEEE International Conference on Cluster Computing. 116--125. DOI:http://dx.doi.org/10.1109/CLUSTR.2008.4663762Google ScholarCross Ref
- InfiniBandTM Trade Association. 2004. InfiniBandTM Architecture Specification. Release 1.2.Google Scholar
- G. Johnson, D. K. Kerbyson, and M. Lang. 2008. Optimization of InfiniBand for scientific applications. In Proceedings of the 2008 IEEE International Symposium on Parallel and Distributed Processing (IPDPS'08). 1--8. DOI:http://dx.doi.org/10.1109/IPDPS.2008.4536345Google ScholarCross Ref
- Ajai Kapoor and Romeo Rizzi. 2000. Edge-coloring bipartite graphs. Journal of Algorithms 34, 2, 390--396. DOI:http://dx.doi.org/10.1006/jagm.1999.1058 Google ScholarDigital Library
- John Kim, William J. Dally, and Dennis Abts. 2006. Adaptive routing in high-radix Clos network. In Proceedings of the ACM/IEEE SC 2006 Conference. 7--7. DOI:http://dx.doi.org/10.1109/SC.2006.10 Google ScholarDigital Library
- C. E. Leiserson. 1985. Fat-trees: Universal networks for hardware-efficient supercomputing. IEEE Transactions on Computing C-34, 10, 892--901. DOI:http://dx.doi.org/10.1109/TC.1985.6312192 Google ScholarDigital Library
- Xuan-Yi Lin, Yeh-Ching Chung, and Tai-Yi Huang. 2004. A multiple LID routing scheme for fat-tree-based InfiniBand networks. In Proceedings of the 18th International Parallel and Distributed Processing Symposium. 11. DOI:http://dx.doi.org/10.1109/IPDPS.2004.1302913Google Scholar
- P. Lopez, J. Flich, and J. Duato. 2001. Deadlock-free routing in InfiniBandTM through destination renaming. In Proceedings of the 2001 International Conference on Parallel Processing. 427--434. DOI:http://dx.doi.org/10.1109/ICPP.2001.952089 Google ScholarDigital Library
- Wickus Nienaber, Xin Yuan, and Zhenhai Duan. 2009. LID assignment in InfiniBand networks. IEEE Transactions on Parallel and Distributed Systems 20, 4, 484--497. DOI:http://dx.doi.org/10.1109/TPDS. 2008.144 Google ScholarDigital Library
- S. R. Ohring, M. Ibel, S. K. Das, and M. J. Kumar. 1995. On generalized fat trees. In Proceedings of the 9th International Parallel Processing Symposium. 37--44. DOI:http://dx.doi.org/10.1109/IPPS.1995.395911 Google ScholarDigital Library
- F. Petrini and M. Vanneschi. 1997. K-ary N-trees: High performance networks for massively parallel architectures. In Proceedings of the 11th International Parallel Processing Symposium. 87--93. DOI:http://dx.doi.org/10.1109/IPPS.1997.580853 Google ScholarDigital Library
- G. Rodriguez, C. Minkenberg, R. Beivide, R. P. Luijten, J. Labarta, and M. Valero. 2009. Oblivious routing schemes in extended generalized fat tree networks. In Proceedings of the 2009 IEEE International Conference on Cluster Computing and Workshops. 1--8. DOI:http://dx.doi.org/10.1109/CLUSTR.2009.5289145Google ScholarCross Ref
- T. Schneider, T. Hoefler, and A. Lumsdaine. 2009. ORCS: An Oblivious Routing Congestion Simulator. TR-675. Indiana University Computer Science.Google Scholar
- A. Singh. 2005. Load-Balanced Routing in Interconnection Networks. Ph.D. Dissertation. Stanford University, Stanford, CA.Google Scholar
- Yuanyuan Yang and Jianchao Wang. 1999. Wide-sense nonblocking Clos networks under packing strategy. IEEE Transactions on Computers 48, 3, 265--284. DOI:http://dx.doi.org/10.1109/12.754994 Google ScholarDigital Library
- Xin Yuan. 2011. On nonblocking folded-Clos networks in computer communication environments. In Proceedings of the 2011 IEEE International Parallel Distributed Processing Symposium (IPDPS'11). 188--196. DOI:http://dx.doi.org/10.1109/IPDPS.2011.27 Google ScholarDigital Library
- X. Yuan, W. Nienaber, Z. Duan, and R. Melhem. 2009. Oblivious routing in fat-tree based system area networks with uncertain traffic demands. IEEE/ACM Transactions on Networking 17, 5, 1439--1452. DOI:http://dx.doi.org/10.1109/TNET.2009.2012853 Google ScholarDigital Library
- Eitan Zahavi, Greg Johnson, Darren J. Kerbyson, and Michael Lang. 2010. Optimized InfiniBand fat-tree routing for shift all-to-all communication patterns. Concurrency and Computation: Practice and Experience 22, 2, 217--231. Google ScholarDigital Library
- Eitan Zahavi. 2012. Fat-tree routing and node ordering providing contention free traffic for MPI global collectives. Journal of Parallel and Distributed Computing 72, 11, 1423--1432. Google ScholarDigital Library
Index Terms
- On Folded-Clos Networks with Deterministic Single-Path Routing
Recommendations
A new routing scheme for Jellyfish and its performance with HPC workloads
SC '13: Proceedings of the International Conference on High Performance Computing, Networking, Storage and AnalysisThe jellyfish topology where switches are connected using a random graph has recently been proposed for large scale data-center networks. It has been shown to offer higher bisection bandwidth and better permutation throughput than the corresponding fat-...
A new routing algorithm for multirate rearrangeable Clos networks
In this paper, we study the problem of finding routing algorithms on the multirate rearrangeable Clos networks which use as few number of middle-stage switches as possible. We propose a new routing algorithm called the “grouping algorithm”. This is a ...
Adaptive routing in Clos networks
ICCD '95: Proceedings of the 1995 International Conference on Computer Design: VLSI in Computers and ProcessorsWe describe a method of controlling a three-stage Clos nonblocking switch where "speculative" self-routing over the Clos fabric is augmented with reservations over a control network that connects controllers in the input and output stages of the switch. ...
Comments