
MemPol: Policing Core Memory Bandwidth from Outside of the Cores

IEEE Real Time Technology and Applications Symposium, 2023
In today’s multiprocessor systems-on-a-chip (MPSoC), the shared memory subsystem is a known source of temporal interference. The problem causes logically independent cores to affect each other’s performance, leading to pessimistic worst-case execution ...
Alexander Zuepke   +4 more
semanticscholar   +1 more source

A 7-nm Four-Core Mixed-Precision AI Chip With 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling

IEEE Journal of Solid-State Circuits, 2022
Reduced precision computation is a key enabling factor for energy-efficient acceleration of deep learning (DL) applications. This article presents a 7-nm four-core mixed-precision artificial intelligence (AI) chip that supports four compute precisions ...
Sae Kyu Lee   +43 more
semanticscholar   +1 more source

A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling

IEEE International Solid-State Circuits Conference, 2021
Low-precision computation is the key enabling factor to achieve high compute densities (TOPS/W and TOPS/mm²) in AI hardware accelerators across cloud and edge platforms.
A. Agrawal   +43 more
semanticscholar   +1 more source

Fine-Grained QoS Control via Tightly-Coupled Bandwidth Monitoring and Regulation for FPGA-based Heterogeneous SoCs

Design Automation Conference, 2023
Embedded systems are increasingly adopting heterogeneous templates integrating hardware accelerators and application-specific processors, which poses novel challenges.
Gianluca Brilli   +6 more
semanticscholar   +1 more source

Glint: Decentralized Federated Graph Learning with Traffic Throttling and Flow Scheduling

International Workshop on Quality of Service, 2021
Federated learning has been proposed as a promising distributed machine learning paradigm with strong privacy protection on training data. Existing work mainly focuses on training convolutional neural network (CNN) models good at learning on image/voice ...
Tao Liu, Pengjie Li, Yu-Lei Gu
semanticscholar   +1 more source

LIBRA: Clearing the Cloud Through Dynamic Memory Bandwidth Management

International Symposium on High-Performance Computer Architecture, 2021
Modern Cloud Service Providers (CSP) heavily co-schedule tasks with different priorities on the same computing node to increase server utilization. To ensure the performance of high priority jobs, CSPs usually employ Quality-of-Service (QoS) mechanisms ...
Ying Zhang   +9 more
semanticscholar   +1 more source

Test Time Reduction with Data Throttling Techniques in a Multi Core SoC Design

International Symposium on VLSI Design and Test
Test time and test volume reduction have been a serious concern in 7nm or lower technology nodes. In the era of AI, SoC sizes and the number of cores used in the SoC ensure Moore’s law in the semiconductor industry for the unforeseeable future. There are no
Jatin Chakravarti, Chintan Panchal
semanticscholar   +1 more source

Coherence-Aided Memory Bandwidth Regulation

IEEE Real-Time Systems Symposium
With the increasing adoption of PS-PL (Processor System-Programmable Logic) platforms, also known as CPU+FPGA systems, there arises a need for efficient resource management strategies.
Ivan Izhbirdeev   +6 more
semanticscholar   +1 more source

Per-Bank Bandwidth Regulation of Shared Last-Level Cache for Real-Time Systems

IEEE Real-Time Systems Symposium
Modern commercial-off-the-shelf (COTS) multicore processors have advanced memory hierarchies that enhance memory-level parallelism (MLP), which is crucial for high performance.
Connor Sullivan   +3 more
semanticscholar   +1 more source

Near-side prefetch throttling: adaptive prefetching for high-performance many-core processors

International Conference on Parallel Architectures and Compilation Techniques, 2018
In modern processors, prefetching is an essential component for hiding long-latency memory accesses. However, prefetching too aggressively can easily degrade performance by evicting useful data from cache, or by saturating precious memory bandwidth ...
W. Heirman   +4 more
semanticscholar   +1 more source
