AVX-512 - Open Access .click


Transcoding Unicode Characters with AVX-512 Instructions

Software: Practice and Experience, 2022
Intel includes in its recent processors a powerful set of instructions capable of processing 512‐bit registers with a single instruction (AVX‐512). Some of these instructions have no equivalent in earlier instruction sets. We leverage these instructions to efficiently transcode strings between the most common formats: UTF‐8 and UTF‐16. With our
Daniel Lemire, Robert Clausecker
wiley +5 more sources

Scalability analysis of AVX-512 extensions [PDF]

The Journal of Supercomputing, 2019
Energy efficiency below a specific thermal design power (TDP) has become the main design goal for microprocessors across all market segments. Optimizing the usage of the available transistors within the TDP is a pending topic. Parallelism is the basic foundation for achieving the exascale level.
Juan M. Cebrian, Lasse Natvig, Magnus Jahre +2 more
core +4 more sources

String searching with mismatches using AVX2 and AVX-512 instructions

Information Processing Letters, 2023
Tamanna Chhabra, Sukhpal Singh Ghuman, Jorma Tarhio +2 more
openaire +4 more sources

VECTORIZATION OF SMALL-SIZED SPECIAL-TYPE MATRICES MULTIPLICATION USING INSTRUCTIONS AVX-512

Современные информационные технологии и IT-образование, 2018
Modern software packages for supercomputer calculations require a large amount of computing resources. At the same time, new hardware architectures open up new opportunities for optimizing program code.
Leonid A. Benderskiy, Alexey A. Rybakov, Sergey S. Shumilin +2 more
doaj +3 more sources

Gem5-AVX: Extension of the Gem5 Simulator to Support AVX Instruction Sets

IEEE Access
Recent commodity x86 CPUs still dominate the majority of supercomputers and most of them implement vector architectures to support single instruction multiple data (SIMD).
Seungmin Lee, Youngsok Kim, Dukyun Nam, Jong Kim +3 more
doaj +2 more sources

Acceleration of Particle Swarm Optimization with AVX Instructions [PDF]

Applied Sciences, 2023
Parallel implementations of algorithms are usually compared with single-core CPU performance. The advantage of multicore vector processors decreases the performance gap between GPU and CPU computation, as shown in many recent pieces of research. With the
Jakub Safarik, Vaclav Snasel
doaj +2 more sources

Implementation of a vectorized Quicksort using AVX-512 intrinsics [PDF]

2021
For decades, improvements in computing speed were achieved by raising the CPU clock frequency. In recent years, physical limits have slowed this mechanism. Modern single-threaded applications must therefore make greater use of CPU features in order to benefit from the advances of new processor generations ...
Frank Thiemicke, Mark Blacher, Lars Kühne +2 more
openaire +2 more sources

SeqMatcher: efficient genome sequence matching with AVX-512 extensions [PDF]

The Journal of Supercomputing
The recent emergence of long-read sequencing technologies has enabled substantial improvements in accuracy and reduced computational costs. Nonetheless, pairwise sequence alignment remains a time-consuming step in common bioinformatics pipelines, becoming a bottleneck in de novo whole-genome assembly.
Elena Espinosa +3 more
openaire +4 more sources

Vectorized Falcon-Sign Implementations using SSE2, AVX2, AVX-512F, NEON, and RVV

Transactions on Cryptographic Hardware and Embedded Systems
Falcon, an NTRU-based digital signature algorithm, has been selected by NIST as one of the post-quantum cryptography (PQC) standards. Compared to verification, the signature generation of Falcon is relatively slow. One of the core operations in signature
Jipeng Zhang, Jiaheng Zhang
doaj +2 more sources

Efficient Parallel Implementations of PIPO Block Cipher on CPU and GPU

IEEE Access, 2022
Data encryption is essential for securely managing clients’ data on servers in a data-centric ICT environment. Clients must encrypt the data before transmitting it to servers or other clients. Encrypting a large volume of data requires a lot of time.
Hojin Choi, Seog Chung Seo
doaj +1 more source

fos: computer and information sciences
simd
vectorization

software optimization
approximate string matching
004

isogeny-based cryptography
avx2
post-quantum cryptography