Results 211 to 220 of about 21,831,698
Some of the following articles may not be open access.

Parallel Image Processing with the Block Data Parallel Architecture

IBM Journal of Research and Development, 1996
Many digital signal and image processing algorithms can be speeded up by executing them in parallel on multiple processors. The speed of parallel execution is limited by the need for communication and synchronization between processors. In this paper, we present a paradigm for parallel processing that we call the block data flow paradigm (BDFP).
W.E. Alexander   +2 more
openaire   +1 more source
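
The block data flow paradigm itself is only named in the snippet above, so the C++ sketch below illustrates just the general idea of block-partitioned parallel image processing: the image is split into row blocks and each block is filtered on its own thread, with the join acting as the synchronization point. The filter, block shape, and image layout are assumptions for illustration, not the BDFP described in the paper.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Apply a simple per-pixel operation to one horizontal block of the image.
// (The operation here is an arbitrary brightness scaling, chosen for illustration.)
static void process_block(std::vector<float>& image, std::size_t width,
                          std::size_t row_begin, std::size_t row_end) {
    for (std::size_t r = row_begin; r < row_end; ++r)
        for (std::size_t c = 0; c < width; ++c)
            image[r * width + c] *= 0.5f;
}

// Partition the image into one row block per worker and process the blocks in parallel.
void process_image_in_blocks(std::vector<float>& image,
                             std::size_t width, std::size_t height,
                             unsigned workers) {
    std::vector<std::thread> pool;
    std::size_t rows_per_block = (height + workers - 1) / workers;
    for (unsigned w = 0; w < workers; ++w) {
        std::size_t begin = w * rows_per_block;
        std::size_t end   = std::min(height, begin + rows_per_block);
        if (begin >= end) break;
        pool.emplace_back([&image, width, begin, end] {
            process_block(image, width, begin, end);
        });
    }
    for (auto& t : pool) t.join();   // synchronization point: wait for all blocks
}
```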

Function-Parallel Computation in a Data-Parallel Environment

1993 International Conference on Parallel Processing - ICPP'93 Vol2, 1993
Asynchronous problems are those which may be decomposed into a set of independent sub-tasks which are suitable for concurrent execution. The function parallelism of these problems cannot normally be directly expressed using the data-parallel programming model.
Alex L. Cheung, Anthony P. Reeves
openaire   +1 more source

Loop Parallelism Maximization for Multimedia Data Processing in Mobile Vehicular Clouds

IEEE Transactions on Cloud Computing, 2019
Mobile vehicular cloud has become popular with the rapid development of cloud computing and mobile computing. Nested loops are usually the most critical part in multimedia and high performance Digital Signal Processing (DSP) systems which are widely used
Meikang Qiu, Wenyun Dai, A. Vasilakos
semanticscholar   +1 more source

Data-Parallel Programming on A Reconfigurable Parallel Computer

IETE Technical Review, 1998
We present a preprocessor which converts programs written in a data parallel version of C into standard C to run on CDOT CHiPPS, a reconfigurable parallel computer. The main contribution is the development of rewriting methods and an optimizing protocol for inter-processor communication during barrier synchronization.
Ranjan K Sen   +3 more
openaire   +1 more source
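
The snippet above mentions inter-processor communication during barrier synchronization. As a minimal illustration of what a barrier guarantees (and not of the CHiPPS preprocessor or its communication optimization), the C++20 sketch below has each thread compute a local partial result and read the other threads' results only after all of them have arrived at the barrier.

```cpp
#include <barrier>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    constexpr int workers = 4;
    std::vector<int> partial(workers, 0);
    std::barrier sync_point(workers);  // every worker must arrive before any proceeds

    std::vector<std::thread> pool;
    for (int id = 0; id < workers; ++id) {
        pool.emplace_back([&, id] {
            partial[id] = id * id;         // phase 1: purely local computation
            sync_point.arrive_and_wait();  // barrier: wait for all workers
            int total = 0;                 // phase 2: safe to read others' results
            for (int v : partial) total += v;
            std::printf("worker %d sees total %d\n", id, total);
        });
    }
    for (auto& t : pool) t.join();
}
```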

Mechanisms for Parallel Data Transport

2012
Evolving paradigms of parallel transport mechanisms are necessary to satisfy the ever-increasing demand for high-performing communication systems. Parallel transport mechanisms can be described as a technique for sending multiple data streams simultaneously over several parallel channels.
Okyere Benya J.   +4 more
openaire   +2 more sources

Exploitation of control parallelism in data parallel algorithms

Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation, 1995
This paper considers the matrix decomposition A = LDL^T as a vehicle to explore the improvement in performance obtainable through the execution of multiple streams of control on SIMD architectures. Several methods for partitioning the SIMD array are considered.
V. Garg, D.E. Schimmel
openaire   +1 more source
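
For readers who do not recognize the factorization in the snippet above: A = LDL^T writes a symmetric matrix A as the product of a unit lower-triangular matrix L, a diagonal matrix D, and L^T. The C++ sketch below is a plain sequential version of that factorization, without the paper's SIMD partitioning; pivoting and error handling are omitted.

```cpp
#include <vector>

// Factor a symmetric matrix A (row-major, n x n) as A = L * D * L^T,
// where L is unit lower-triangular and D is diagonal.
// No pivoting: assumes the factorization exists without it.
void ldlt(const std::vector<double>& A, int n,
          std::vector<double>& L, std::vector<double>& D) {
    L.assign(static_cast<std::size_t>(n) * n, 0.0);
    D.assign(n, 0.0);
    for (int j = 0; j < n; ++j) {
        double d = A[j * n + j];
        for (int k = 0; k < j; ++k)
            d -= L[j * n + k] * L[j * n + k] * D[k];   // subtract earlier columns
        D[j] = d;
        L[j * n + j] = 1.0;                            // unit diagonal
        for (int i = j + 1; i < n; ++i) {
            double s = A[i * n + j];
            for (int k = 0; k < j; ++k)
                s -= L[i * n + k] * L[j * n + k] * D[k];
            L[i * n + j] = s / D[j];
        }
    }
}
```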

PipeDream: generalized pipeline parallelism for DNN training

Symposium on Operating Systems Principles, 2019
DNN training is extremely time-consuming, necessitating efficient multi-accelerator parallelization. Current approaches to parallelizing training primarily use intra-batch parallelization, where a single iteration of training is split over the available ...
D. Narayanan   +7 more
semanticscholar   +1 more source
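
PipeDream's actual scheduling and weight-versioning machinery is well beyond a snippet, so the sketch below only illustrates pipeline parallelism in the abstract: two "stages" run on separate threads connected by a blocking queue, so the second stage can consume micro-batch i while the first is already producing micro-batch i+1. The stage bodies and micro-batch values are placeholders.

```cpp
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>

// A tiny blocking queue connecting two pipeline stages.
template <typename T>
class Channel {
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
public:
    void push(T v) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(v)); }
        cv_.notify_one();
    }
    T pop() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return !q_.empty(); });
        T v = std::move(q_.front());
        q_.pop();
        return v;
    }
};

int main() {
    constexpr int micro_batches = 8;
    Channel<int> stage1_to_2;

    // Stage 1: placeholder for "forward pass of the first layers".
    std::thread stage1([&] {
        for (int mb = 0; mb < micro_batches; ++mb)
            stage1_to_2.push(mb * 10);   // hand the activation to the next stage
    });

    // Stage 2: placeholder for "forward pass of the remaining layers".
    std::thread stage2([&] {
        for (int mb = 0; mb < micro_batches; ++mb) {
            int activation = stage1_to_2.pop();
            std::printf("stage 2 consumed micro-batch %d (value %d)\n", mb, activation);
        }
    });

    stage1.join();
    stage2.join();
}
```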

USP: A Unified Sequence Parallelism Approach for Long Context Generative AI

arXiv.org
Sequence parallelism (SP), which divides the sequence dimension of input tensors across multiple computational devices, is becoming key to unlocking the long-context capabilities of generative AI models.
Jiarui Fang, Shangchun Zhao
semanticscholar   +1 more source
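
As a back-of-the-envelope illustration of the partitioning described above (and not of USP's actual algorithm or its communication pattern), the sketch below merely computes which contiguous slice of the sequence dimension each of N devices would own; the hard part, attention across slices, is not shown.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

// For a sequence of length seq_len split across n_devices, return the
// [begin, end) token range each device owns along the sequence dimension.
std::vector<std::pair<int, int>> sequence_shards(int seq_len, int n_devices) {
    std::vector<std::pair<int, int>> shards;
    int base = seq_len / n_devices, extra = seq_len % n_devices;
    int begin = 0;
    for (int d = 0; d < n_devices; ++d) {
        int len = base + (d < extra ? 1 : 0);   // spread the remainder evenly
        shards.emplace_back(begin, begin + len);
        begin += len;
    }
    return shards;
}

int main() {
    for (auto [b, e] : sequence_shards(/*seq_len=*/10, /*n_devices=*/4))
        std::printf("tokens [%d, %d)\n", b, e);
}
```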

Data Parallel C++

Proceedings of the International Workshop on OpenCL, 2020
SYCL™ is a heterogeneous programming framework built on top of modern C++. Data Parallel C++, recently introduced as part of Intel's oneAPI project, is an implementation of SYCL. Data Parallel C++ (DPC++) is being developed as an open-source project on top of Clang and LLVM.
Ben Ashbaugh   +7 more
openaire   +1 more source
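
Since the entry above describes DPC++ as an implementation of SYCL, a generic SYCL 2020 kernel may help readers place it. The sketch below is a plain vector addition using unified shared memory, compilable with a SYCL 2020 toolchain such as DPC++; it is an illustration, not code from the cited talk.

```cpp
#include <sycl/sycl.hpp>
#include <cstdio>

int main() {
    constexpr size_t n = 1024;
    sycl::queue q;                                   // default-selected device

    float* a = sycl::malloc_shared<float>(n, q);     // USM: visible to host and device
    float* b = sycl::malloc_shared<float>(n, q);
    float* c = sycl::malloc_shared<float>(n, q);
    for (size_t i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Data-parallel kernel: one work-item per element.
    q.parallel_for(sycl::range<1>{n}, [=](sycl::id<1> i) {
        c[i] = a[i] + b[i];
    }).wait();

    std::printf("c[0] = %f\n", c[0]);
    sycl::free(a, q); sycl::free(b, q); sycl::free(c, q);
}
```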

On the parallelism of data

1994
This article presents a data-parallel language, which has been designed around the concepts of relations and reduction operations. Many parallel machines provide hardware support for reduction operations (such as summing all elements of an array), and these operations are widely used in parallel scientific computing.
openaire   +1 more source
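
The reduction operations mentioned above (such as summing all elements of an array) have a direct counterpart in standard C++. The sketch below uses the C++17 parallel execution policy; whether the reduction actually runs in parallel depends on the standard library and hardware.

```cpp
#include <cstdio>
#include <execution>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> data(1'000'000, 1.0);

    // Parallel reduction: the library may sum chunks of the array concurrently
    // and then combine the partial sums.
    double total = std::reduce(std::execution::par, data.begin(), data.end(), 0.0);

    std::printf("sum = %f\n", total);
}
```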
