Retrospective: A Scalable Processing-in-Memory Accelerator for Parallel Graph Processing
Our ISCA 2015 paper provides a new programmable processing-in-memory (PIM) architecture and system design that can accelerate key data-intensive applications, with a focus on graph processing workloads. Our major idea was to completely rethink the system,
Ahn, Junwhan +4 more
core
QoS Driven Coordinated Management of Resources to Save Energy in Multi-Core Systems
Reducing the energy consumption of computing systems is a necessary endeavor. However, saving energy should not come at the expense of degrading user experience.
Nejat, Mehrzad
core
To satisfy the various hardware constraints, architectural exploration can be used to determine the optimal parameters of a VLIW (Very Long Instruction Word) processor for a given application, such as the number of units ...
Yviquel, Hervé
core
Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture
Many modern workloads, such as neural networks, databases, and graph processing, are fundamentally memory-bound. For such workloads, the data movement between main memory and CPU cores imposes a significant overhead in terms of both latency and energy. A
Fernandez, Ivan +5 more
core
The number and diversity of consumer devices are growing rapidly, alongside their target applications' memory consumption. Unfortunately, DRAM scalability is becoming a limiting factor to the available memory capacity in consumer devices.
Boroumand, Amirali +10 more
core
A Prior Study of Split Compilation and Approximate Floating-Point Computations
With a view toward future heterogeneous multicore processors, this internship studied several optimization processes specific to floating-point numbers.
Taguchi, Takanoki
core
TransPimLib: A Library for Efficient Transcendental Functions on Processing-in-Memory Systems
Processing-in-memory (PIM) promises to alleviate the data movement bottleneck in modern computing systems. However, current real-world PIM systems have the inherent disadvantage that their hardware is more constrained than in conventional processors (CPU,
Guo, Yuxin +5 more
core
Hypersparse Traffic Matrix Construction using GraphBLAS on a DPU
Low-power small form factor data processing units (DPUs) enable offloading and acceleration of a broad range of networking and security services. DPUs have accelerated the transition to programmable networking by enabling the replacement of FPGAs/ASICs ...
Amariucai, George +12 more
core
Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets
To address increasing compute demand from recent multi-model workloads with heavy models like large language models, we propose to deploy heterogeneous chiplet-based multi-chip module (MCM)-based accelerators.
Faruque, Mohammad Abdullah Al +2 more
core
Processing-using-DRAM (PUD) architectures impose a restrictive data layout and alignment for their operands, where source and destination operands (i) must reside in the same DRAM subarray (i.e., a group of DRAM rows sharing the same row buffer and row ...
Esposito, Emanuele G. +3 more
core

