Some of the following articles may not be open access.

Porting a Legacy CUDA Stencil Code to oneAPI

IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum, 2020
Recently, Intel released the oneAPI programming environment. With Data Parallel C++ (DPC++), oneAPI enables codes to target multiple hardware architectures, such as multi-core CPUs, GPUs, and even FPGAs or other hardware, from a single source.
Steffen Christgau, T. Steinke
semanticscholar   +1 more source

?????????????????????????? ???????????????????? CUDA ?????? ?????????????????? ???????????????????? ?? ?????????????? ???????????????????? ????????????????

2023
The paper focuses on the problem of chemical kinetics, calculation of changes in the concentration of substances in the reactions over time, and creation of a mass kinetic solver to solve the problem using modern parallelization technologies. A mathematical model of variation in the concentration of substances in a system with a one-dimensional ...
????????????????????, ??.C.   +1 more
openaire   +1 more source

HIPCL: Tool for Porting CUDA Applications to Advanced OpenCL Platforms Through HIP

International Workshop on OpenCL, 2020
Heterogeneous-compute Interface for Portability (HIP) is an open-source C++ runtime API and kernel language. It is designed to be compatible with CUDA and to deliver close to native performance on CUDA platforms while exposing additional low-level ...
Michal Babej, P. Jääskeläinen
semanticscholar   +1 more source

Learning CUDA

Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion, 2010
Whereas the fastest supercomputer of 1998 could compute 1.34 trillion double precision floating point operations per second (TFLOPS) [7], today's consumer-level (sub-$500) graphics cards such as the NVidia GeForce GTX 480 can compute 1.35 TFLOPS (single precision) [8].
Nate Anderson   +2 more
openaire   +1 more source

CUDAsmith: A Fuzzer for CUDA Compilers

Annual International Computer Software and Applications Conference, 2020
CUDA is a parallel computing platform and programming model for the graphics processing unit (GPU) of NVIDIA. With CUDA programming, general purpose computing on GPU (GPGPU) is possible. However, the correctness of CUDA programs relies on the correctness
B. Jiang   +6 more
semanticscholar   +1 more source

Investigation of Parallel Data Processing Using Hybrid High Performance CPU + GPU Systems and CUDA Streams

Computing and informatics, 2020
The paper investigates parallel data processing in a hybrid CPU + GPU(s) system using multiple CUDA streams for overlapping communication and computations.
Paweł Czarnul
semanticscholar   +1 more source

CUDA-NP

Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming, 2014
Parallel programs consist of a series of code sections with different thread-level parallelism (TLP). As a result, it is rather common that a thread in a parallel program, such as a GPU kernel in CUDA programs, still contains both sequential code and parallel loops.
Yi Yang, Huiyang Zhou
openaire   +1 more source

Debugging CUDA

Proceedings of the 13th annual conference companion on Genetic and evolutionary computation, 2011
During six months of intensive nVidia CUDA C programming many bugs were created. We pass on the software engineering lessons learnt, particularly those relevant to parallel general-purpose computation on graphics hardware (GPGPU).
openaire   +1 more source

CUDA-MAFFT: Accelerating MAFFT on CUDA-enabled graphics hardware

2013 IEEE International Conference on Bioinformatics and Biomedicine, 2013
Multiple sequence alignment (MSA) constitutes an extremely powerful tool for many biological applications including phylogenetic tree estimation, secondary structure prediction, and critical residue identification. However, aligning large biological sequences with popular tools such as MAFFT requires long runtimes on sequential architectures.
Xiangyuan Zhu, Kenli Li
openaire   +1 more source

Parallelized combined finite‐discrete element (FDEM) procedure using multi‐GPU with CUDA

International journal for numerical and analytical methods in geomechanics (Print), 2019
This paper focuses on the efficiency of finite discrete element method (FDEM) algorithmic procedures in massive computers and analyzes the time‐consuming part of contact detection and interaction computations in the numerical solution.
Quansheng Liu, Weiqin Wang, Hao Ma
semanticscholar   +1 more source
