Results 121 to 130 of about 573,488
Some of the following articles may not be open access.
MemPol: Policing Core Memory Bandwidth from Outside of the Cores
IEEE Real Time Technology and Applications Symposium, 2023
In today’s multiprocessor systems-on-a-chip (MPSoC), the shared memory subsystem is a known source of temporal interference. The problem causes logically independent cores to affect each other’s performance, leading to pessimistic worst-case execution ...
Alexander Zuepke+4 more
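The bandwidth-regulation theme running through several of these results is easiest to picture with a generic budget-based regulator in the spirit of MemGuard-style schemes: each core receives a miss budget per regulation period, tracked through a performance counter, and is stalled once the budget runs out. The C sketch below illustrates only that generic idea; the platform hooks (read_llc_miss_counter, throttle_core) are hypothetical placeholders, and this is not MemPol's actual out-of-core policing mechanism.

```c
/*
 * Minimal sketch of a per-core memory-bandwidth budget regulator
 * (MemGuard-style). The platform hooks below are hypothetical
 * placeholders; MemPol itself polices bandwidth from outside the
 * cores, which this sketch does not model.
 */
#include <stdint.h>
#include <stdbool.h>

#define NUM_CORES        4
#define PERIOD_US        1000   /* regulation period in microseconds */
#define CACHE_LINE_BYTES 64

/* Per-core budget, expressed in LLC misses allowed per period. */
static uint64_t budget_misses[NUM_CORES];
static uint64_t misses_at_period_start[NUM_CORES];

/* Hypothetical platform hooks (placeholders, not a real API). */
uint64_t read_llc_miss_counter(int core);  /* read the core's PMU miss counter  */
void     throttle_core(int core, bool on); /* e.g. park the core in an idle loop */

/* Convert a per-core bandwidth budget in MB/s into misses per period. */
static uint64_t mbps_to_misses(uint64_t mbps)
{
    uint64_t lines_per_sec = mbps * 1000000ull / CACHE_LINE_BYTES;
    return lines_per_sec * PERIOD_US / 1000000ull;
}

void init_regulator(uint64_t per_core_mbps)
{
    for (int c = 0; c < NUM_CORES; c++)
        budget_misses[c] = mbps_to_misses(per_core_mbps);
}

/* Start of each regulation period (e.g. a periodic timer interrupt):
 * snapshot the counters and lift any throttling from the last period. */
void period_start(void)
{
    for (int c = 0; c < NUM_CORES; c++) {
        misses_at_period_start[c] = read_llc_miss_counter(c);
        throttle_core(c, false);
    }
}

/* Enforcement check (polled, or driven by a counter-overflow interrupt):
 * stall any core that has spent its miss budget for this period. */
void enforce_budgets(void)
{
    for (int c = 0; c < NUM_CORES; c++) {
        uint64_t used = read_llc_miss_counter(c) - misses_at_period_start[c];
        if (used >= budget_misses[c])
            throttle_core(c, true);
    }
}
```

In this sketch the enforcement is assumed to run on the regulated cores themselves (the simpler, classical arrangement); MemPol's stated contribution is precisely that the policing logic runs outside the cores.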
IEEE Journal of Solid-State Circuits, 2022
Reduced precision computation is a key enabling factor for energy-efficient acceleration of deep learning (DL) applications. This article presents a 7-nm four-core mixed-precision artificial intelligence (AI) chip that supports four compute precisions ...
Sae Kyu Lee+43 more
IEEE International Solid-State Circuits Conference, 2021
Low-precision computation is the key enabling factor to achieve high compute densities (TOPS/W and TOPS/mm2) in AI hardware accelerators across cloud and edge platforms.
A. Agrawal+43 more
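For orientation on the compute-density units quoted above: TOPS/W is sustained throughput normalized by power, and TOPS/mm2 is the same throughput normalized by die area. A quick unit check with invented numbers (not figures from this chip):

```latex
% Compute density = throughput normalized by power (or area).
\[
  \text{TOPS/W} \;=\; \frac{\text{throughput}\ [\text{ops/s}] \times 10^{-12}}{\text{power}\ [\text{W}]},
  \qquad
  \text{e.g.}\quad
  \frac{50 \times 10^{12}\ \text{ops/s} \times 10^{-12}}{10\ \text{W}} \;=\; 5\ \text{TOPS/W}.
\]
```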
Design Automation Conference, 2023
Embedded systems are increasingly adopting heterogeneous templates integrating hardware accelerators and application-specific processors, which poses novel challenges.
Gianluca Brilli+6 more
Glint: Decentralized Federated Graph Learning with Traffic Throttling and Flow Scheduling
International Workshop on Quality of Service, 2021
Federated learning has been proposed as a promising distributed machine learning paradigm with strong privacy protection on training data. Existing work mainly focuses on training convolutional neural network (CNN) models good at learning on image/voice ...
Tao Liu, Pengjie Li, Yu-Lei Gu
LIBRA: Clearing the Cloud Through Dynamic Memory Bandwidth Management
International Symposium on High-Performance Computer Architecture, 2021
Modern Cloud Service Providers (CSPs) heavily co-schedule tasks with different priorities on the same computing node to increase server utilization. To ensure the performance of high-priority jobs, CSPs usually employ Quality-of-Service (QoS) mechanisms ...
Ying Zhang+9 more
Test Time Reduction with Data Throttling Techniques in a Multi Core SoC Design
International Symposium on VLSI Design and Test
Test time and test volume reduction have been a serious concern at 7 nm and lower technology nodes. In the era of AI, SoC sizes and the number of cores used in the SoC ensure Moore's law holds in the semiconductor industry for the foreseeable future. There are no ...
Jatin Chakravarti, Chintan Panchal
Coherence-Aided Memory Bandwidth Regulation
IEEE Real-Time Systems Symposium
With the increasing adoption of PS-PL (Processor System-Programmable Logic) platforms, also known as CPU+FPGA systems, there arises a need for efficient resource management strategies.
Ivan Izhbirdeev+6 more
Per-Bank Bandwidth Regulation of Shared Last-Level Cache for Real-Time Systems
IEEE Real-Time Systems Symposium
Modern commercial-off-the-shelf (COTS) multicore processors have advanced memory hierarchies that enhance memory-level parallelism (MLP), which is crucial for high performance.
Connor Sullivan+3 more
Near-side prefetch throttling: adaptive prefetching for high-performance many-core processors
International Conference on Parallel Architectures and Compilation Techniques, 2018
In modern processors, prefetching is an essential component for hiding long-latency memory accesses. However, prefetching too aggressively can easily degrade performance by evicting useful data from cache, or by saturating precious memory bandwidth ...
W. Heirman+4 more
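The trade-off this abstract points to is what feedback-directed prefetch throttling manages: measure how many prefetched lines are actually used, then widen or narrow the prefetch degree accordingly. The sketch below shows that generic feedback loop with invented counters and thresholds; it is not the near-side scheme proposed in the paper.

```c
/*
 * Generic feedback-directed prefetch-degree throttle: raise the degree
 * when prefetches are mostly useful, lower it when they mostly pollute
 * the cache or waste bandwidth. Counters and thresholds are illustrative
 * only; this is not the near-side mechanism from the paper above.
 */
#include <stdint.h>

#define MIN_DEGREE 0
#define MAX_DEGREE 8

struct prefetch_throttle {
    uint32_t issued;   /* prefetches issued in the current interval */
    uint32_t useful;   /* prefetched lines hit by demand accesses   */
    int      degree;   /* current prefetch degree (lines ahead)     */
};

/* Call at the end of each sampling interval (e.g. every 64K demand
 * accesses) to adapt the degree from the observed accuracy. */
void prefetch_throttle_update(struct prefetch_throttle *t)
{
    if (t->issued == 0)
        return;

    /* Accuracy in percent, avoiding floating point. */
    uint32_t accuracy = (100u * t->useful) / t->issued;

    if (accuracy > 75 && t->degree < MAX_DEGREE)
        t->degree++;      /* prefetches are paying off: go deeper         */
    else if (accuracy < 40 && t->degree > MIN_DEGREE)
        t->degree--;      /* mostly pollution/bandwidth waste: back off   */

    t->issued = 0;        /* start a fresh sampling interval */
    t->useful = 0;
}
```

A real throttler would typically also weigh bandwidth utilization and cache-pollution feedback, which this accuracy-only heuristic ignores.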