Accelerating Time Series Analysis via Processing using Non-Volatile Memories [PDF]
Time Series Analysis (TSA) is a critical workload for consumer-facing devices. Accelerating TSA is vital for many domains as it enables the extraction of valuable information and the prediction of future events.
Fernandez, Ivan +8 more
core +2 more sources
DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks
Data movement between the CPU and main memory is a first-order obstacle against improving performance, scalability, and energy efficiency in modern systems. Computer systems employ a range of techniques to reduce overheads tied to data movement, spanning
Fernandez, Ivan +7 more
core +1 more source
End-to-end QoS for the open source safety-relevant RISC-V SELENE platform [PDF]
This paper presents the end-to-end QoS approach to provide performance guarantees followed in the SELENE platform, a high-performance RISC-V based heterogeneous SoC for safety-related real-time systems.
Abella Ferrer, Jaume +10 more
core +1 more source
Modern graphics processing units (GPUs) provide impressive computing resources, which can be accessed conveniently through the CUDA programming interface.
Anderson +28 more
core +1 more source
Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud
Neural networks (NNs) are growing in importance and complexity. A neural network's performance (and energy efficiency) can be bound either by computation or memory resources.
Boroumand, Amirali +4 more
core
An efficient use of virtualization in grid/cloud environments [PDF]
Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational resources. Grid enables access to the resources but it does not guarantee any quality of service.
Choudhury, Arindam +4 more
core +1 more source
An Experimental Evaluation of Machine Learning Training on a Real Processing-in-Memory System
Training machine learning (ML) algorithms is a computationally intensive process, which is frequently memory-bound due to repeatedly accessing large training datasets.
Brocard, Sylvan +7 more
core
From classical HPC to deep learning, MatMul is at the heart of today's computing. The recent Maddness method approximates MatMul without the need for multiplication by using a hash-based version of product quantization (PQ) indexing into a look-up table (
Andri, Renzo +4 more
core
We investigate the utility of augmenting a microprocessor with a single execution pipeline by adding a second copy of the execution pipeline in parallel with the existing one.
Desai, Madhav P.
core
SafeTI Traffic Injector Enhancement for Effective Interference Testing in Critical Real-Time Systems
Safety-critical domains, such as automotive, space, and robotics, are adopting increasingly powerful multicores with abundant hardware shared resources for higher performance and efficiency.
Abella, Jaume +3 more
core

