Instruction level parallelism - Open Access .click

Efficient resources assignment schemes for clustered multithreaded processors

2008
New feature sizes provide a larger number of transistors per chip that architects could use to further exploit instruction-level parallelism. However, these technologies also bring new challenges that complicate conventional monolithic processor ...
Fernando, Latorre +2 more
core +1 more source

Exploring Various Levels of Parallelism in High-Performance CRC Algorithms

IEEE Access, 2019
Modern processors have increased the capabilities of instruction-level parallelism (ILP) and thread-level parallelism (TLP). These resources, however, typically exhibit poor utilization on conventional cyclic redundancy check (CRC) algorithms.
Mucong Chi, Dazhong He, Jun Liu
doaj +1 more source

A Highly-Efficient and Tightly-Connected Many-Core Overlay Architecture

IEEE Access, 2021
The technology advances of CPU (Central Processing Unit) architecture alternate between generalization and specialization. In the past decade, the general performance has been enhanced while addressing the new brick walls that include power, memory, and ...
Riadh Ben Abdelhamid, Yoshiki Yamaguchi, Taisuke Boku +2 more
doaj +1 more source

Compiler-Directed Parallelism Scaling Framework for Performance Constrained Energy Optimization

IEEE Access, 2020
The evolution of semiconductor manufacturing technology has led to rising leakage current and the end of Dennard scaling. In the dark silicon era, an aggressive power-gating scheme with quantitative management of power-gated hardware resources is ...
Yung-Cheng Ma
doaj +1 more source

Design and Implementation of SIMD Unaligned Memory Access Structure

Jisuanji gongcheng, 2016
Single Instruction Multiple Data (SIMD) is an effective approach to realize data-level parallelism, but accessing unaligned data seriously affects vectorization of the program and causes processor performance degradation. In order to reduce the latency of ...
YU Chenglong, WANG Yongwen
doaj +1 more source

Evaluation Method Based on FPGA Emulation for Resistive Neural Network Accelerators

Jisuanji gongcheng, 2021
The Processing-in-Memory (PIM) neural network accelerators based on resistive devices require careful simulation and evaluation during the early stage of architecture design, which ensures the accuracy of neural networks to meet the requirements of design.
SHI Yongquan, JING Naifeng
doaj +1 more source

Loop pipelining with resource and timing constraints

1996
Developing efficient programs for many of the current parallel computers is not easy due to the architectural complexity of those machines. The wide variety of machine organizations often makes it more difficult to port an existing program than to ...
Sánchez Carracedo, Fermín
core +2 more sources

Boosting Parallel Applications Performance on Applying DIM Technique in a Multiprocessing Environment

International Journal of Reconfigurable Computing, 2011
Limits of instruction-level parallelism and higher transistor density sustain the increasing need for multiprocessor systems: they are rapidly taking over both general-purpose and embedded processor domains.
Mateus B. Rutzig +7 more
doaj +1 more source

User privacy prevention model using supervised federated learning‐based block chain approach for internet of Medical Things

CAAI Transactions on Intelligence Technology, EarlyView, 2023
This research focuses on addressing the privacy issues in healthcare advancement monitoring with the rapid establishment of the decentralised communication system in the Internet of Medical Things (IoMT). An integrated blockchain homomorphic encryption standard with an in‐built supervised learning‐based smart contract is designed to improvise ...
Chandramohan Dhasarathan +7 more
wiley +1 more source

A case for merging the ILP and DLP paradigms

1998
The goal of this paper is to show that instruction-level parallelism (ILP) and data-level parallelism (DLP) can be merged in a single architecture to execute vectorizable code at a performance level that cannot be achieved using either paradigm on its ...
Espasa Sans, Roger +2 more
core +1 more source
