Some of the following articles may not be open access.
Automatic Core Specialization for AVX-512 Applications
Proceedings of the 13th ACM International Systems and Storage Conference, 2020
Advanced Vector Extension (AVX) instructions operate on wide SIMD vectors. Due to the resulting high power consumption, recent Intel processors reduce their frequency when executing complex AVX2 and AVX-512 instructions. Following non-AVX code is slowed down by this frequency reduction in two situations: When it executes on the sibling hyperthread of ...
Frank Bellosa
Fair Scheduling for AVX2 and AVX-512 Workloads.
CPU schedulers such as the Linux Completely Fair Scheduler try to allocate equal shares of the CPU performance to tasks of equal priority by allocating equal CPU time as a technique to improve quality of service for individual tasks. Recently, CPUs have, however, become power-limited to the point where different subsets of the instruction set allow for
Gottschlag, Mathias +3 more
Acceleration of Large Integer Multiplication with Intel AVX-512 Instructions
2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2018
In this paper, we propose an implementation of large integer multiplication using Single Instruction Multiple Data (SIMD) instructions. We evaluated the implementation on an Intel Xeon Phi processor. The second generation Intel Xeon Phi processor, Knights Landing, has a set of Advanced Vector Extensions-512 (AVX-512) instructions.
Takuya Edamatsu +1 more
Hadamard Transform Improvement for HEVC using Intel AVX-512
High Efficiency Video Coding (HEVC) doubles the data compression ratio compared to the previous generation compression technology, Moving Picture Experts Group Advanced Video Coding (MPEG-AVC/H.264), without sacrificing image quality. However, this superior compression comes at the cost of a heavier computational load, resulting in longer time for encoding and
Jackson Teh Ka Sing +3 more
An implementation of matrix–matrix multiplication on the Intel KNL processor with AVX-512
Cluster Computing, 2018
The second generation Intel Xeon Phi processor, codenamed Knights Landing (KNL), has recently emerged with a 2D tile mesh architecture and the Intel AVX-512 instructions. However, it is very difficult for general users to get the maximum performance from the new architecture since they are not familiar with optimal cache reuse, efficient vectorization ...
Roktaek Lim +2 more
Fast Multiple-Precision Integer Division Using Intel AVX-512
IEEE Transactions on Emerging Topics in Computing, 2023
Takuya Edamatsu +1 more
Experiments on Speeding Up the Recursive Fast Fourier Transform by Using AVX-512 SIMD Instructions
The Fast Fourier Transform is probably one of the most studied algorithms of all time. New techniques regarding hardware and software are often applied and tested on it, but the interest in FFT is still large because of its applications - signal and ...
Marco Cococcioni
A new AXT format for an efficient SpMV product using AVX-512 instructions and CUDA
The Sparse Matrix-Vector (SpMV) product is a key operation used in many scientific applications. This work proposes a new sparse matrix storage scheme, the AXT format, that improves SpMV performance on platforms with vector capabilities.
E Coronado-Barrientos +2 more
An Implementation of Parallel Number-Theoretic Transform Using Intel AVX-512 Instructions
Lecture Notes in Computer Science, 2022
Daisuke Takahashi +1 more

