Panel: parallel computing in the undergraduate computer science curriculum [PDF]
Nan C. Schaller
openalex +1 more source
I-structures: data structures for parallel computing
Arvind +2 more
openalex +2 more sources
Method for determining the acceleration of a parallel specialised computer system based on Amdahl's law [PDF]
A modification of Amdahl's law is considered for the case where the number of processor elements in a computer system is increased. The coefficient $k$ linking the accelerations of parallel and parallel specialized computer systems is determined. The limiting values of the coefficient are investigated, and its theoretical maximum is calculated.
arxiv
Visualizing parallel simulations in network computing environments [PDF]
Christopher D. Carothers +4 more
openalex +1 more source
Accelerating Large Language Model Training with 4D Parallelism and Memory Consumption Estimator [PDF]
In large language model (LLM) training, several parallelization strategies, including Tensor Parallelism (TP), Pipeline Parallelism (PP), Data Parallelism (DP), as well as Sequence Parallelism (SP) and Context Parallelism (CP), are employed to distribute model parameters, activations, and optimizer states across devices.
arxiv
xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism [PDF]
Diffusion models are pivotal for generating high-quality images and videos. Inspired by the success of OpenAI's Sora, the backbone of diffusion models is evolving from U-Net to Transformer, known as Diffusion Transformers (DiTs). However, generating high-quality content necessitates longer sequence lengths, exponentially increasing the computation ...
arxiv