Avx2 - Open Access .click


AVX2-optimized Kvazaar HEVC intra encoder

2016 IEEE International Conference on Image Processing (ICIP), 2016
This paper presents efficient SIMD optimizations for the open-source Kvazaar HEVC intra encoder. The C implementation of Kvazaar is accelerated by Intel AVX2 instructions whose effect on Kvazaar ultrafast preset is profiled. According to our profiling results, C functions of SATD, DCT, quantization, and intra prediction account for over 60% of the ...
Ari Lemmetti, Ari Koivula, Marko Viitanen +2 more
exaly +4 more sources

Tailored AVX2 Transform Kernels for Versatile Video Coding

2023 IEEE Nordic Circuits and Systems Conference (NorCAS), 2023
Peer ...
Joose Sainio, Alexandre Mercat, Jarno Vanne +2 more
exaly +4 more sources

Some of the following articles may not be open access.

Related searches:

post-quantum cryptography
simd
nist pqc

ntru
gpu
saber

oil and vinegar
fos: computer and information sciences
single instruction multiple data simd

Optimizing Dilithium Implementation with AVX2/-512

Transactions on Embedded Computing Systems
Dilithium is a signature scheme that is currently being standardized by NIST as the Module-Lattice-Based Digital Signature Standard. Based on lattice problems, it is believed to be secure even against attacks from large-scale quantum computers. The implementation efficiency is important for promoting the migration of current cryptography algorithms to ...
Runqing Xu, Debiao He, Min Luo
exaly +2 more sources

Accelerating stereo vision algorithm using SSE3, AVX2, and CUDA

2017 Iranian Conference on Electrical Engineering (ICEE), 2017
Stereo vision is widely used in robotics, unmanned cars, aerial surveys, and many real-time applications. It also requires computationally expensive calculations because of stereo matching. In real-time applications, the execution time of the stereo vision depth detection algorithm is very important.
M. Kokhazadeh +3 more
openaire +2 more sources

Fair Scheduling for AVX2 and AVX-512 Workloads.

, 2021
CPU schedulers such as the Linux Completely Fair Scheduler try to allocate equal shares of the CPU performance to tasks of equal priority by allocating equal CPU time as a technique to improve quality of service for individual tasks. Recently, CPUs have, however, become power-limited to the point where different subsets of the instruction set allow for
Gottschlag, Mathias +3 more
openaire +2 more sources

Research on Accelerating the Performance of SpMV Based on AVX2 Instruction Set

Proceedings of the 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence, 2020
SpMV (Sparse Matrix-Vector Multiplication) is widely used in many computing fields, and its performance requirements grow with the amount of data. On CPUs, multi-threaded programming can substantially speed up SpMV computation. Besides, for CPU processors with SIMD
Haodong Bian, Jianqiang Huang
exaly +2 more sources

High performance implementation of 2-D convolution using AVX2

2017 19th International Symposium on Computer Architecture and Digital Systems (CADS), 2017
Convolution is the most important and fundamental concept in multimedia processing. The 2-D convolution is used for different filtering operations such as sharpening, smoothing, and edge detection. It performs many mathematical operations on all image pixels, which makes it a compute-intensive kernel.
Hossein Amiri, Asadollah Shahbahrami
exaly +2 more sources

AVX2 Programming – Extended Instructions

, 2018
In this chapter, you learn how to use some of the instruction set extensions that were introduced in Chapter 8. The first section contains a couple of source code examples that exemplify use of the scalar and packed fused-multiply-add (FMA) instructions. The second section covers instructions that involve the general-purpose registers.
Daniel Kusswurm
openaire +2 more sources

On Improving the Speedup of Slice and Tile Level Parallelism in HEVC Using AVX2

Proceedings of the 21st Pan-Hellenic Conference on Informatics, 2017
HEVC has emerged as the new video coding standard promising improved compression ratios (for the same quality) by up to 50% compared to H.264/AVC. To achieve this performance HEVC requires increased computational overhead compared to its predecessor. For this reason parallelism is used, usually at a coarse grained level, e.g., per slice or tile.
Dimitris Skoumpourdis +5 more
openaire +2 more sources

Fast Implementation of Simeck Family Block Ciphers Using AVX2

2018 International Conference on Platform Technology and Service (PlatCon), 2018
At CHES 2015, the Simeck family of lightweight block ciphers was proposed, which has a similar architecture to SIMON and SPECK. Previous work on implementing the Simeck family of lightweight block ciphers has focused on the embedded device environment. In this paper, we propose fast implementation methods for the Simeck family of lightweight block ciphers by ...
Taehwan Park, Hwajeong Seo, Howon Kim
exaly +2 more sources
