Output details

11 - Computer Science and Informatics

University of Bedfordshire

Article title

Cache-oblivious matrix algorithms in the age of multicores and many cores

Type
D - Journal article
Title of journal
Concurrency and Computation: Practice and Experience
Article number
-
Volume number
online
Issue number
-
First page of article
n/a
ISSN of journal
15320626
Year of publication
2012
URL
-
Number of additional authors
1
Additional information

<12> We highlight the issue of upcoming wider single-instruction, multiple-data units as well as steadily increasing core counts on contemporary and future processor architectures. Our matrix multiplication and LU decomposition code TifaMMy has been ported and tuned on four architectures: SGI's UltraViolet distributed shared-memory machine, Intel's Xeon architecture Sandy Bridge, AMD's Bulldozer architecture, and Intel's Xeon Phi architecture. We also comment on the feasibility of graphics processing units. Results are discussed and compared with vendors’ architecture-specific and optimised libraries, namely Math Kernel Library (MKL) and AMD Core Math Library (ACML). TifaMMy executes with equally efficient performance on all four architectures, underlining its generic and cache-oblivious properties.

Interdisciplinary
-
Cross-referral requested
-
Research group
T - Centre for Research in Distributed Technologies
Citation count
1
Proposed double-weighted
No
Double-weighted statement
-
Reserve for a double-weighted output
No
Non-English
No
English abstract
-