Joy Y. Cheng, Daniel P. Sanders, et al.
SPIE Advanced Lithography 2008
We give nearly optimal algorithms for matrix transpose on meshes with wormhole and XY routing and with a 1-port or 2-port communication model. For an N x N mesh, where N = 3 · 2n and each mesh node has a submatrix of size m to be transposed, our best algorithm takes about Nm/3.27 time steps. The lower bound is Nm/3.414. While there is no previously known algorithm for matrix transpose on meshes with wormhole and XY routing, a naive algorithm, which is naturally adapted from the well-known Recursive Exchange Algorithm, has a complexity of about Nm. That is our best algorithm improves over the naive algorithm by about a factor of 3.27, and is about a factor of 3.414/3.27 of the lower bound. © 1998 Elsevier Science B.V. All rights reserved.
Joy Y. Cheng, Daniel P. Sanders, et al.
SPIE Advanced Lithography 2008
Jaione Tirapu Azpiroz, Alan E. Rosenbluth, et al.
SPIE Photomask Technology + EUV Lithography 2009
Ziv Bar-Yossef, T.S. Jayram, et al.
Journal of Computer and System Sciences
Matthew A Grayson
Journal of Complexity