Output details

11 - Computer Science and Informatics

University of Bedfordshire

Article title

Cache-oblivious matrix algorithms in the age of multicores and many cores

Type

D - Journal article

DOI

10.1002/cpe.2974

Title of journal

Concurrency and Computation: Practice and Experience

Article number

Volume number

online

Issue number

First page of article

n/a

ISSN of journal

1532-0626

Year of publication

2012

URL

Number of additional authors

Additional information

<12> We highlight the issue of upcoming wider single-instruction, multiple-data units as well as steadily increasing core counts on contemporary and future processor architectures. Our matrix multiplication and LU decomposition code TifaMMy has been ported and tuned on four architectures: SGI's UltraViolet distributed shared-memory machine, Intel's Sandy Bridge Xeon architecture, AMD's Bulldozer architecture, and Intel's Xeon Phi architecture. We also comment on the feasibility of graphics processing units. Results are discussed and compared with vendors’ architecture-specific and optimised libraries, namely Math Kernel Library (MKL) and AMD Core Math Library (ACML). TifaMMy executes with equally efficient performance on all four architectures, underlining its generic and cache-oblivious properties.

Interdisciplinary

Cross-referral requested

Research group

T - Centre for Research in Distributed Technologies

Citation count

Proposed double-weighted

Double-weighted statement

Reserve for a double-weighted output

Non-English

English abstract