The advent of new technologies brings revolutions in the fields of VLSI design and high performance computing. On one hand, the increasing number of processing elements, both in on-chip multi-core systems and supercomputer systems, demands high bandwidth communications. On the other hand, the performance of the system, usually measured by the latency and power consumption, is gradually being dominated by the interconnection networks. These facts raise challenges in synthesizing and optimizing interconnection networks.
In this dissertation, we study methodologies and algorithms to perform the interconnection network synthesis and optimization in both on-chip networks and supercomputer systems. We explore a wide range of network topologies and physical implementations, and evaluate the performance of multi-commodity flow (MCF) algorithms. We design efficient approximation schemes to solve different variations of MCF problems, which incorporate different practical constraints. The automated design flows discover much larger design space than the traditional methods and therefore achieve promising results.
In the study of Network-on-Chip (NoC), we are optimizing the communication latency and power consumption, which are two competing design objectives. With an improved fully polynomial approximation algorithm, power optimal design of a structured 8 × 8 NoC can be found for given average latency constraints with certain communication bandwidth requirements. Our methodology explores a large number of topologies, introduces a variety of wire styles into NoC design, and incorporates latency constraints and power minimization objectives into a unified MCF model, with simultaneous optimization on network topologies, physical embedding, and interconnect wire styles. The results demonstrate the strengths of the optimized networks and indicate the clear trend of power and latency tradeoffs.
In the synthesis and optimization of networks in supercomputer systems, we use the packaging framework of the Blue Gene/L supercomputer as an example to demonstrate the advantages of our design flow, which has incorporated real design issues, such as board dimensions and pin numbers. Using real benchmark traces, the experiments show that the best topologies identified by our algorithm can achieve better average latency compared to the existing 3-dimensional torus networks.
Recommendations
Performance Analysis of k-ary n-cube Interconnection Networks
VLSI communication networks are wire-limited, i.e. the cost of a network is not a function of the number of switches required, but rather a function of the wiring density required to construct the network. Communication networks of varying dimensions ...
Design Exploration of Optical Interconnection Networks for Chip Multiprocessors
HOTI '08: Proceedings of the 2008 16th IEEE Symposium on High Performance InterconnectsThe Network-on-Chip (NoC) paradigm has emerged as a promising solution for providing connectivity among the increasing number of cores that get integrated into both systems-on chip(SoC) and chip multiprocessors (CMP). In future high performance CMPs, ...