Factoring out ordered sections to expose thread-level parallelism [PDF]
With the rise of multi-core processors, researchers are taking a new look at extending the applicability of auto-parallelization techniques. In this paper, we identify a dependence pattern on which auto-parallelization currently fails.
De Bosschere, Koen +2 more
core +1 more source
Running scientific codes on Amazon EC2: a performance analysis of five high-end instances
Amazon Web Services (AWS) is a well-known public Infrastructure-as-a-Service (IaaS) provider whose Elastic Computing Cloud (EC2) offering includes some instances, known as cluster instances, aimed at High-Performance Computing (HPC) applications.
Roberto R. Expósito +4 more
doaj
Unified Schemes for Directive-Based GPU Offloading
The GPU is the dominant accelerator device due to its high performance and energy efficiency. Directive-based GPU offloading using OpenACC or OpenMP target is a convenient way to port existing codes originally developed for multicore CPUs.
Yohei Miki, Toshihiro Hanawa
doaj +1 more source
Peer ...
Ayguadé Parra, Eduard +8 more
openaire +3 more sources
Multilevel Parallelization of AutoDock 4.2
Background: Virtual (computational) screening is an increasingly important tool for drug discovery. AutoDock is a popular open-source application for performing molecular docking, the prediction of ligand-receptor interactions.
Norgan Andrew P +4 more
doaj +1 more source
Performance Evaluation of a Hybrid Programming Model for RSDFT on T2K Open Supercomputer
Non-uniform memory access (NUMA) systems, where each processor has its own memory, have been a popular platform in high-end computing. While some early studies reported that a flat-MPI programming model outperformed an OpenMP/MPI hybrid programming ...
Miwako Tsuji, Mitsuhisa Sato
doaj +1 more source
Implementing OpenMP 4.0 for the NVIDIA PTX architecture in GCC compiler
The paper describes the approach used in implementing OpenMP offloading to NVIDIA accelerators in GCC. Offloading refers to a new capability in the OpenMP 4.0 specification update that allows the programmer to specify regions of code that should be executed ...
A. V. Monakov, V. A. Ivanishin
doaj +1 more source
The recent surge in high-performance computing (HPC) demands, particularly with the advent of Exascale supercomputers, has highlighted the need for robust parallel systems.
Salwa Saad +4 more
doaj +1 more source
Defect Detection and Correction in OpenMP: A Static Analysis and Machine Learning-Based Solution
Concurrency defects such as race conditions, deadlocks, and improper synchronization remain a critical challenge in developing reliable OpenMP-based parallel applications.
Norah A. Al-Johany +4 more
doaj +1 more source
Nonlinear Wave Simulation on the Xeon Phi Knights Landing Processor
We consider a standing wave simulation that is interesting from a computational point of view, obtained by solving coupled 2D perturbed Sine-Gordon equations. We develop an OpenMP implementation that exploits both thread-level and SIMD-level parallelism.
Hristov Ivan +2 more
doaj +1 more source

