
Fast matrix multiplication via compiler‐only layered data reorganization and intrinsic lowering

open access: yes, Software: Practice and Experience, Volume 53, Issue 9, Pages 1793-1814, September 2023
Abstract: The resurgence of machine learning has increased the demand for high‐performance basic linear algebra subroutines (BLAS), which have long depended on libraries to achieve peak performance on commodity hardware. High‐performance BLAS implementations rely on a layered approach that consists of tiling and packing layers, for data (re)organization ...
Braedy Kuzma   +6 more
wiley   +1 more source

Applying AVX512 vectorization to improve the performance of a random number generator

open access: yes, Труды Института системного программирования РАН, 2018
Generating uniformly distributed random numbers is necessary for computer simulation by Monte Carlo methods and molecular dynamics. Pseudo-random number generators (PRNGs) are used to produce these numbers.
M. S. Guskova   +2 more
doaj   +1 more source

A method of video entropy coding based on the AVX-512 extended SIMD instruction set [Спосіб ентропійного кодування відео на базі розширеного набору інструкцій SIMD AVX-512]

open access: yes, 2022
This work aims to reduce the time needed for entropy coding of video on processors with the AVX-512 extended instruction set, through parallelization and the use of SIMD instructions additional to those in AVX2 and ...
Русанова, О.В.   +1 more
core   +1 more source

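Software bitsliced implementation of the Kalyna cipher oriented to the use of SIMD instructions of x86-64 microprocessors [ПРОГРАМНА BITSLICED-ІМПЛЕМЕНТАЦІЯ ШИФРУ «КАЛИНА» ОРІЄНТОВАНА НА ВИКОРИСТАННЯ SIMD-ІНСТРУКЦІЙ МІКРОПРОЦЕСОРІВ З АРХІТЕКТУРОЮ Х86-64]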

open access: yes, Кібербезпека: освіта, наука, техніка, 2020
The article is devoted to a software bitsliced implementation of the Kalyna cipher using SSE, AVX, and AVX-512 vector instructions for x86-64 processors. It analyzes the advantages and disadvantages of different approaches to efficient and secure software implementation of block ...
Yaroslav Sovyn, Volodymyr Khoma
doaj   +1 more source

VECTORIZATION OF OPERATIONS ON SMALL-DIMENSIONAL MATRICES FOR INTEL XEON PHI KNIGHTS LANDING PROCESSOR

open access: yes, Современные информационные технологии и IT-образование, 2018
The article is devoted to the vectorization of calculations for the Intel Xeon Phi Knights Landing (KNL) processor. Operations on small-dimensional matrices are the target of optimization. These operations are widespread in computational codes in various fields ...
Leonid A. Benderskiy   +2 more
doaj   +1 more source

Vectorization of CMSSW offline software

open access: yes, EPJ Web of Conferences
The CMS experiment has been utilizing vectorization, or SIMD, in parts of its data processing applications for over a decade. On x86 platforms, the vectorization level is still SSE3. In the past, attempts to use wider vector instruction sets such as AVX or ...
Gartung Patrick
doaj   +1 more source

Experiments on Speeding Up the Recursive Fast Fourier Transform by using AVX-512 SIMD instructions

open access: yes, 2022
The Fast Fourier Transform is probably one of the most studied algorithms of all time. New hardware and software techniques are often applied to it and tested on it, yet interest in the FFT remains high because of its applications - signal and ...
Giacomo Sansone, Marco Cococcioni
core   +1 more source

Optimization of the N-body Simulation on Intel’s Architectures Based on AVX-512 Instruction Set

open access: yes, 2020
N-body simulations have become a powerful tool for testing gravitational interactions among particles, in systems ranging from a few bodies to complete galaxies.
Chichizola, Franco   +7 more
core   +1 more source

Remote AVX Overhead: Detection and Mitigation

open access: yes, 2021
Due to power constraints, recent Intel CPUs reduce their frequency when executing AVX2 and AVX-512 instructions. Often, this frequency reduction affects other applications as well, which reduces overall performance and prevents contemporary operating ...
Gottschlag, Mathias
core   +1 more source

Converting Binary Floating‐Point Numbers to Shortest Decimal Strings: An Experimental Review

open access: yes, Software: Practice and Experience, Volume 56, Issue 4, Pages 462-478, April 2026
Abstract: Background. When sharing or logging numerical data, we must convert binary floating‐point numbers into their decimal string representations. For example, the number π might become 3.1415927. Engineers have perfected many algorithms for producing such accurate, short strings.
Jaël Champagne Gareau, Daniel Lemire
wiley   +1 more source
