From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation [PDF]

open access: yes, 2013
Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute ...
Blazewicz, Marek   +8 more
core   +4 more sources

Accelerating Nested Conditionals on CGRA With Tag-Based Full Predication Method

open access: yes, IEEE Access, 2020
CGRA (Coarse-grained Reconfigurable Architecture) has been widely considered as one of the most promising computing architectures to exploit spatial parallelism.
Jiang Sha   +3 more
doaj   +1 more source

Thread partitioning and value prediction for exploiting speculative thread-level parallelism [PDF]

open access: yes, 2004
Speculative thread-level parallelism has been recently proposed as a source of parallelism to improve the performance in applications where parallel threads are hard to find.
González Colás, Antonio María   +2 more
core   +2 more sources

A Pipeline-Based ODE Solving Framework

open access: yes, IEEE Access
The traditional parallel solving methods of ordinary differential equations (ODE) are mainly classified into task-parallelism, data-parallelism, and instruction-level parallelism.
Ruixia Cao, Shangjun Hou, Lin Ma
doaj   +1 more source

goSLP: Globally Optimized Superword Level Parallelism Framework

open access: yes, 2018
Modern microprocessors are equipped with single instruction multiple data (SIMD) or vector instruction sets which allow compilers to exploit superword level parallelism (SLP), a type of fine-grained parallelism.
Amarasinghe, Saman, Mendis, Charith
core   +1 more source

RPPM: Rapid Performance Prediction of Multithreaded workloads on multicore processors [PDF]

open access: yes, 2019
Analytical performance modeling is a useful complement to detailed cycle-level simulation to quickly explore the design space in an early design stage. Mechanistic analytical modeling is particularly interesting as it provides deep insight and does not ...
Akram, Shoaib   +3 more
core   +2 more sources

Vectorization of Program Code Containing Low Probability Regions in Computational Geometry Problems

open access: yes, Современные информационные технологии и IT-образование, 2022
Improving application performance is an important practical task for supercomputer calculations. Along with parallelization of calculations between cluster nodes (for example, using MPI tools), as well as multithreaded programming (for example, using ...
Alexey Rybakov
doaj   +1 more source

Design Principles for Sparse Matrix Multiplication on the GPU

open access: yes, 2018
We implement two novel algorithms for sparse-matrix dense-matrix multiplication (SpMM) on the GPU. Our algorithms expect the sparse input in the popular compressed-sparse-row (CSR) format and thus do not require expensive format conversion.
A Tiskin   +11 more
core   +1 more source

Quantifying the benefits of SPECint distant parallelism in simultaneous multithreading architectures [PDF]

open access: yes, 1999
We exploit the existence of distant parallelism that future compilers could detect and characterise its performance under simultaneous multithreading architectures.
Ayguadé Parra, Eduard   +4 more
core   +1 more source