Parallelism Analysis of Subroutine-Level Speculative in HPEC [PDF]
Effective application of Thread-Level Speculation (TLS) technology can improve the hardware resource utilization of multicore chips, and has achieved successful results in the automatic parallelization of multiple serial applications. However, it lacks efficient
WANG Xinyi, WANG Yaobin, LI Ling, YANG Yang, BU Deqing, LIU Zhiqin
doaj +1 more source
Data-parallel polygonization [PDF]
Data-parallel algorithms are presented for polygonizing a collection of line segments represented by a data-parallel bucket PMR quadtree, a data-parallel R-tree, and a data-parallel R+-tree. Such an operation is useful in a geographic information system (GIS).
Erik G. Hoel, Hanan Samet
openaire +1 more source
Collective Communication Performance Evaluation for Distributed Deep Learning Training
In distributed deep learning, the improper use of the collective communication library can lead to a decline in deep learning performance due to increased communication time.
Sookwang Lee, Jaehwan Lee
doaj +1 more source
Sorting algorithm acceleration based on CPU-FPGA heterogeneous system
Traditional sorting methods, such as bubble sort and selection sort, are mainly implemented serially in software. These algorithms often rely on sequential comparison, and their time complexity is relatively high.
Kou Yuanbo +3 more
doaj +1 more source
Impact of Design Decisions on Performance of Embarrassingly Parallel .NET Database Application
The implementation of parallel applications is always a challenge. It involves many distinct design decisions. The paper presents issues of parallel processing using .NET applications and popular Database Management Systems (
Piotr Karwaczyński +6 more
doaj +1 more source
Discursive interaction consists of dialogism and parallelism. This research aims to analyze the discursive interaction that occurs in the Mappettu Ada event.
Zulkhaeriyah Zulkhaeriyah
doaj +1 more source
Efficient Computerized-Tomography Reconstruction Using Low-Cost FPGA-DSP Chip [PDF]
In this paper, the filtered back-projection algorithm is optimally implemented using a low-cost Spartan 3A-DSP 3400 chip. The optimization enables parallel implementation.
Bassam A. Abo-Elftooh +2 more
doaj +1 more source
Thread partitioning and value prediction for exploiting speculative thread-level parallelism [PDF]
Speculative thread-level parallelism has recently been proposed as a source of parallelism to improve performance in applications where parallel threads are hard to find.
González Colás, Antonio María +2 more
core +2 more sources
A hybrid MPI-OpenMP scheme for scalable parallel pseudospectral computations for fluid turbulence [PDF]
A hybrid scheme that utilizes MPI for distributed memory parallelism and OpenMP for shared memory parallelism is presented. The work is motivated by the desire to achieve exceptionally high Reynolds numbers in pseudospectral computations of fluid ...
Mininni, Pablo D. +3 more
core +2 more sources
An all‐in‐one analog AI accelerator is presented, enabling on‐chip training, weight retention, and long‐term inference acceleration. It leverages a BEOL‐integrated CMO/HfOx ReRAM array with low‐voltage operation (<1.5 V), multi‐bit capability over 32 states, low programming noise (10 nS), and near‐ideal weight transfer.
Donato Francesco Falcone +11 more
wiley +1 more source

