Local Basic Linear Algebra Subroutines (LBLAS) for the CM-5/5E

D Kramer, SL Johnsson, Y Hu
DOI: https://doi.org/10.1177/109434209601000403
1996-01-01
Abstract: The Connection Machine Scientific Software Library (CMSSL) is a library of scientific routines designed for distributed memory architectures. The basic linear algebra subroutines (BLAS) of the CMSSL have been implemented as a two-level structure to exploit optimizations both local to nodes and across nodes. This paper presents the implementation considerations and performance of the local BLAS, that is, the BLAS local to each node of the system. A wide variety of loop structures and unrollings have been implemented in order to achieve uniformly high performance, irrespective of the data layout in node memory. The CMSSL is the only existing high performance library capable of supporting both the data parallel and message-passing modes of programming a distributed memory computer. The implications of implementing BLAS on distributed memory computers are considered in this light.
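As a rough illustration of what "loop unrollings" for layout-independent performance can mean, the following is a minimal sketch of a strided dot product unrolled by four in plain C. It is not the CMSSL implementation, which targets the CM-5/5E node hardware; the function name `dot_strided_unrolled`, the unroll factor, and the assumption of positive element strides are all hypothetical choices made for this example.

```c
#include <stddef.h>

/* Hypothetical example, not CMSSL code.
 * Computes sum over i of x[i*incx] * y[i*incy] for i in [0, n).
 * incx and incy are element strides, assumed positive here, so the
 * same kernel serves contiguous and non-contiguous layouts. */
double dot_strided_unrolled(size_t n,
                            const double *x, size_t incx,
                            const double *y, size_t incy)
{
    /* Four independent accumulators expose instruction-level
     * parallelism and shorten the dependency chain. */
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop: four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[(i + 0) * incx] * y[(i + 0) * incy];
        s1 += x[(i + 1) * incx] * y[(i + 1) * incy];
        s2 += x[(i + 2) * incx] * y[(i + 2) * incy];
        s3 += x[(i + 3) * incx] * y[(i + 3) * incy];
    }

    double sum = (s0 + s1) + (s2 + s3);

    /* Remainder loop: the last n mod 4 elements. */
    for (; i < n; ++i) {
        sum += x[i * incx] * y[i * incy];
    }
    return sum;
}
```

A library aiming at uniform performance across layouts would typically carry several such variants (different unroll factors, unit-stride special cases) and select among them at call time; the single kernel here only shows the basic shape.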