Some of the following articles may not be open access.
MVAPICH2-GPU: optimized GPU to GPU communication for InfiniBand clusters
Computer Science - Research and Development, 2011
Data-parallel architectures, such as General-Purpose Graphics Processing Units (GPGPUs), have seen a tremendous rise in their application for High End Computing. However, data movement in and out of GPGPUs remains the biggest hurdle to overall performance and programmer productivity.
Hao Wang +5 more
openaire +1 more source
GPU-Ether: GPU-native Packet I/O for GPU Applications on Commodity Ethernet
IEEE INFOCOM 2021 - IEEE Conference on Computer Communications, 2021
Despite the advent of various network enhancement technologies, providing high-performance networking for GPU-accelerated applications on commodity Ethernet remains a challenge. Kernel-bypass I/O, such as DPDK or netmap, which is normally optimized for host memory-based CPU applications, has limitations on improving the performance of GPU ...
Changue Jung +4 more
openaire +1 more source
Proceedings of the 16th International Workshop on Software and Compilers for Embedded Systems, 2013
GPUs have evolved into programmable, energy-efficient compute accelerators for massively parallel applications. Still, compute power is lost in many applications because cycles are spent on data movement and control instead of computations on actual data. Additional cycles can also be lost to pipeline stalls caused by long-latency operations.
Braak, van den, G.J.W., Corporaal, H.
openaire +1 more source
GPU-SM: shared memory multi-GPU programming
Proceedings of the 8th Workshop on General Purpose Processing using GPUs, 2015
Discrete GPUs in modern multi-GPU systems can transparently access each other's memories through the PCIe interconnect. Future systems will improve this capability by including better GPU interconnects such as NVLink. However, remote memory access across GPUs has gone largely unnoticed among programmers, and multi-GPU systems are still programmed like ...
Javier Cabezas +4 more
openaire +1 more source
Proceedings of the 9th conference on Computing Frontiers, 2012
Scalable heterogeneous computing (SHC) architectures are emerging as a response to new requirements for low cost, power efficiency, and high performance. For example, numerous contemporary HPC systems use commodity Graphics Processing Units (GPUs) to supplement traditional multicore processors.
Vinod Tipparaju, Jeffrey S. Vetter
openaire +1 more source
Proceedings of the ACM International Conference on Supercomputing, 2019
Future High-Performance Computing (HPC) systems will likely be composed of accelerator-dense heterogeneous computers because accelerators are able to deliver higher performance at lower costs, socket counts and energy consumption. Such accelerator-dense nodes pose a reliability challenge because preserving a large amount of state within accelerators ...
Kyushick Lee +5 more
openaire +1 more source
GPU-to-GPU and Host-to-Host Multipattern String Matching on a GPU
IEEE Transactions on Computers, 2013
We develop GPU adaptations of the Aho-Corasick and multipattern Boyer-Moore string matching algorithms for the two cases GPU-to-GPU (input to the algorithms is initially in GPU memory and the output is left in GPU memory) and host-to-host (input and output are in the memory of the host CPU). For the GPU-to-GPU case, we consider several refinements to a
Xinyan Zha, Sartaj Sahni
openaire +1 more source

