
Parallel Fourier Space Image Similarity Calculation Based on SIMD [PDF]

open access: yes; Jisuanji gongcheng, 2021
Existing models for three-dimensional cryo-electron microscopy (cryo-EM) reconstruction frequently call the Fourier space-based image similarity algorithm, and the resulting high computational overhead slows the running speed of the ...
GUO Yuluo, BIAN Haodong, DONG Runting, TANG Jiahao, WANG Xiaoying, HUANG Jianqiang
doaj   +1 more source

AVX-512 extension to OpenQCD 1.6 [PDF]

open access: yes; Proceedings of The 36th Annual International Symposium on Lattice Field Theory, PoS(LATTICE2018), 2019
9 pages, 4 figures and 4 tables. Presented at The 36th Annual International Symposium on Lattice Field Theory (Lattice 2018), 22-28 July, 2018, Michigan State University, East Lansing, Michigan ...
Ed Bennett   +3 more
openaire   +2 more sources

Vectorization of Program Code Containing Low Probability Regions in Computational Geometry Problems

open access: yes; Современные информационные технологии и IT-образование, 2022
Improving application performance is an important practical task for supercomputer calculations. Along with parallelization of calculations between cluster nodes (for example, using MPI tools), as well as multithreaded programming (for example, using ...
Alexey Rybakov
doaj   +1 more source

Vectorizing and distributing number‐theoretic transform to count Goldbach partitions on Arm‐based supercomputers

open access: yes; Concurrency and Computation: Practice and Experience, Volume 35, Issue 28, 25 December 2023
Summary: In this article, we explore the usage of scalable vector extension (SVE) to vectorize number‐theoretic transforms (NTTs). In particular, we show that 64‐bit modular arithmetic operations, including modular multiplication, can be efficiently implemented with SVE instructions.
Ricardo Jesus   +2 more
wiley   +1 more source

Fine‐grain task‐parallel algorithms for matrix factorizations and inversion on many‐threaded CPUs

open access: yes; Concurrency and Computation: Practice and Experience, Volume 35, Issue 27, 10 December 2023
Abstract: We extend a two‐level task partitioning previously applied to the inversion of dense matrices via Gauss–Jordan elimination to the more challenging QR factorization as well as the initial orthogonal reduction to band form found in the singular value decomposition.
Sandra Catalán   +4 more
wiley   +1 more source

Wirelessly interfacing sensor‐equipped implants and MR scanners for improved safety and imaging

open access: yes; Magnetic Resonance in Medicine, Volume 90, Issue 6, Pages 2608-2626, December 2023
Abstract. Purpose: To investigate a novel reduced RF heating method for imaging in the presence of active implanted medical devices (AIMDs) which employs a sensor‐equipped implant that provides wireless feedback. Methods: The implant, consisting of a generator case and a lead, measures RF‐induced E‐fields at the implant tip using a simple sensor in ...
Berk Silemek   +5 more
wiley   +1 more source

A portable C++ library for memory and compute abstraction on multi‐core CPUs and GPUs

open access: yes; Concurrency and Computation: Practice and Experience, Volume 35, Issue 25, 15 November 2023
Abstract: We present a C++ library for transparent memory and compute abstraction across CPU and GPU architectures. Our library combines generic data structures like vectors, multi‐dimensional arrays, maps, graphs, and sparse grids with basic generic algorithms like arbitrary‐dimensional convolutions, copying, merging, sorting, prefix sum, reductions ...
Pietro Incardona   +3 more
wiley   +1 more source

StreamPU: A DSEL for high throughput and low latency software‐defined radio on multicore CPUs

open access: yes; Concurrency and Computation: Practice and Experience, Volume 35, Issue 23, 25 October 2023
Summary: This article presents a new Domain Specific Embedded Language (DSEL) dedicated to Software‐Defined Radio (SDR). From a set of carefully designed components, it enables building efficient software digital communication systems that take advantage of the parallelism of modern processor architectures, in a straightforward and safe manner for ...
Adrien Cassagne   +5 more
wiley   +1 more source

Energy Efficiency of a New Parallel PIC Code for Numerical Simulation of Plasma Dynamics in Open Trap

open access: yes; Mathematics, 2022
The generation of energy-efficient parallel scientific codes has become very important in an era of carbon footprint reduction. In this paper, we briefly present our latest particle-in-cell code with the results of a numerical simulation of plasma dynamics ...
Igor Chernykh   +7 more
doaj   +1 more source

Acceleration with long vector architectures: Implementation and evaluation of the FFT kernel on NEC SX‐Aurora and RISC‐V vector extension

open access: yes; Concurrency and Computation: Practice and Experience, Volume 35, Issue 20, 10 September 2023
Summary: Novel architectures leveraging long and variable vector lengths, like the NEC SX‐Aurora or the vector extension of RISC‐V, are appearing as promising solutions on the supercomputing market. These architectures often require re‐coding of scientific kernels.
Pablo Vizcaino   +3 more
wiley   +1 more source
