Mix-GEMM: An efficient HW-SW Architecture for Mixed-Precision Quantized Deep Neural Networks Inference on Edge Devices

open access: yes, International Symposium on High-Performance Computer Architecture, 2023
Deep Neural Network (DNN) inference based on quantized narrow-precision integer data represents a promising research direction toward efficient deep learning computations on edge and mobile devices.
Enrico Reggiani   +6 more
semanticscholar   +1 more source

An Experimental Evaluation of Machine Learning Training on a Real Processing-in-Memory System

open access: yes, 2023
Training machine learning (ML) algorithms is a computationally intensive process, which is frequently memory-bound due to repeatedly accessing large training datasets.
Brocard, Sylvan   +7 more
core  

An efficient use of virtualization in grid/cloud environments [PDF]

open access: yes, 2011
Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational resources. Grid enables access to the resources but it does not guarantee any quality of service.
Choudhury, Arindam   +4 more
core   +1 more source

Lightweight Hardware Implementation of Binary Ring-LWE PQC Accelerator

open access: yes, IEEE Computer Architecture Letters, 2022
Significant innovation has been made in the development of public-key cryptography that is able to withstand quantum attacks, known as post-quantum cryptography (PQC).
B. J. Lucas   +9 more
semanticscholar   +1 more source

Stella Nera: Achieving 161 TOp/s/W with Multiplier-free DNN Acceleration based on Approximate Matrix Multiplication

open access: yes, 2023
From classical HPC to deep learning, MatMul is at the heart of today's computing. The recent Maddness method approximates MatMul without the need for multiplication by using a hash-based version of product quantization (PQ) indexing into a look-up table (
Andri, Renzo   +4 more
core  

An evaluation of a microprocessor with two independent hardware execution threads coupled through a shared cache

open access: yes, 2023
We investigate the utility of augmenting a microprocessor with a single execution pipeline by adding a second copy of the execution pipeline in parallel with the existing one.
Desai, Madhav P.
core  

SafeTI Traffic Injector Enhancement for Effective Interference Testing in Critical Real-Time Systems

open access: yes, 2023
Safety-critical domains, such as automotive, space, and robotics, are adopting increasingly powerful multicores with abundant hardware shared resources for higher performance and efficiency.
Abella, Jaume   +3 more
core  

QoS Driven Coordinated Management of Resources to Save Energy in Multi-Core Systems [PDF]

open access: yes, 2019
Reducing the energy consumption of computing systems is a necessary endeavor. However, saving energy should not come at the expense of degrading user experience.
Nejat, Mehrzad
core  

Retrospective: A Scalable Processing-in-Memory Accelerator for Parallel Graph Processing

open access: yes, 2023
Our ISCA 2015 paper provides a new programmable processing-in-memory (PIM) architecture and system design that can accelerate key data-intensive applications, with a focus on graph processing workloads. Our major idea was to completely rethink the system,
Ahn, Junwhan   +4 more
core  

Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture

open access: yes, 2021
Many modern workloads, such as neural networks, databases, and graph processing, are fundamentally memory-bound. For such workloads, the data movement between main memory and CPU cores imposes a significant overhead in terms of both latency and energy. A
Fernandez, Ivan   +5 more
core  
