A Methodology for Design and Applications of Parallel Computers
The Graduate School, Stony Brook University: Stony Brook, NY.
This dissertation focuses on two aspects of parallel computing, i.e., development and applications of parallel computers. First, we introduce a new technique by strategically interlacing bypass rings to torus (iBT network) for generating more efficient grid-like interconnection networks. Second, we derive an algebraic formulation of mapping tasks to parallel computers with complex network architectures for realizing their potentials. Compared to the widely adopted mesh and torus network topologies, our new iBT network has many superior characteristics: (1) its network diameter and average node-to-node distances are significantly reduced; (2) the simplicity of a grid-like layout is preserved; (3) it outperforms other bypass torus networks; and (4) it has far more flexible network sizes. A mathematical model is further devised to analyze the dependencies of the iBT network diameters on bypass schemes, thus enabling discovery of a class of the most efficient bypass schemes for a given node degree and network size. Additionally, a pipelined broadcast algorithm for the all-port nodal ability is present and analyzed, demonstrating the collective performance. The iBT networks is finding broad applications in designing higher-dimensional and larger-scale parallel computers as the 3-D torus networks have done for parallel computers with fewer processors. We have developed a new formulation for the task mapping in efficient application of a parallel computer with complex networks such as iBT. The fact that the supply matrix, characterizing the network topologies, exhibits enormous symmetries allows us the transformation of the demand matrix measuring the communication demands of applications to derive a hop-byte objective function in terms of the eigen properties. This new eigen-based formulation dramatically reduces the complexity of finding the solutions for the objective functions from the conventional and widely adopted graph theory-based formulations. Numerical experiments with simulated annealing demonstrate such gains. This formulation enables solution of critical task mapping problems on large-scale parallel computers.