Results 201 to 210 of about 62,688
Some of the following articles may not be open access.

Selective GPU caches to eliminate CPU-GPU HW cache coherence

2016 IEEE International Symposium on High Performance Computer Architecture (HPCA), 2016
Cache coherence is ubiquitous in shared memory multiprocessors because it provides a simple, high performance memory abstraction to programmers. Recent work suggests extending hardware cache coherence between CPUs and GPUs to help support programming models with tightly coordinated sharing between CPU and GPU threads.
Neha Agarwal   +5 more
openaire   +1 more source
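The abstract above describes extending hardware cache coherence across CPU and GPU caches. As background (this is a generic textbook MSI protocol, not the paper's selective-caching mechanism), the sketch below shows the bookkeeping such coherence implies: a toy directory keeping one cache line coherent between a hypothetical "cpu" cache and "gpu" cache.

```python
# Toy MSI coherence sketch (illustrative, not the paper's design):
# a directory keeps one line coherent across two caches, "cpu" and "gpu".

class MSILine:
    """One cache line's coherence state in a single cache: M, S, or I."""
    def __init__(self):
        self.state = "I"  # start Invalid

class Directory:
    """Minimal directory serializing reads/writes to one shared line."""
    def __init__(self):
        self.caches = {"cpu": MSILine(), "gpu": MSILine()}

    def read(self, who):
        line = self.caches[who]
        if line.state == "I":
            # Any Modified copy elsewhere must be downgraded to Shared.
            for other, l in self.caches.items():
                if other != who and l.state == "M":
                    l.state = "S"
            line.state = "S"
        return line.state

    def write(self, who):
        # Invalidate all other copies before granting Modified.
        for other, l in self.caches.items():
            if other != who:
                l.state = "I"
        self.caches[who].state = "M"
        return self.caches[who].state

d = Directory()
assert d.read("cpu") == "S"       # CPU read: I -> S
assert d.write("gpu") == "M"      # GPU write invalidates the CPU copy
assert d.caches["cpu"].state == "I"
assert d.read("cpu") == "S"       # CPU re-read downgrades GPU M -> S
assert d.caches["gpu"].state == "S"
```

Every cross-device invalidation and downgrade here is a coherence message over the CPU-GPU interconnect, which is exactly the traffic the paper's selective caching aims to avoid.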

Accelerating Concurrent Workloads with CPU Cache Partitioning

2018 IEEE 34th International Conference on Data Engineering (ICDE), 2018
Modern microprocessors include a sophisticated hierarchy of caches to hide the latency of memory access and thereby speed up data processing. However, multiple cores within a processor usually share the same last-level cache. This can hurt performance, especially in concurrent workloads whenever a query suffers from cache pollution caused by another ...
Stefan Noll   +3 more
openaire   +1 more source
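The pollution effect this abstract describes can be reproduced with a small simulation (an illustrative sketch, not the paper's system): a streaming scan evicts a co-running query's small hot set from a shared LRU last-level cache, while partitioning the same total capacity preserves the hot set's hit rate.

```python
from collections import OrderedDict

# Illustrative sketch: shared vs partitioned LRU cache under two
# co-running workloads. Sizes and workloads are made up for the demo.

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, addr):
        """Return True on hit, False on miss."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # evict LRU line
        self.lines[addr] = True
        return False

def run(partitioned):
    if partitioned:
        a_cache, b_cache = LRUCache(32), LRUCache(32)  # half each
    else:
        a_cache = b_cache = LRUCache(64)               # one shared cache
    a_hits = a_refs = 0
    scan = 0
    for _ in range(100):
        for a in range(16):                  # query A: 16-line hot set
            a_hits += a_cache.access(("A", a))
            a_refs += 1
        for _ in range(64):                  # query B: streaming scan
            b_cache.access(("B", scan))
            scan += 1
    return a_hits / a_refs

assert run(partitioned=False) == 0.0   # scan evicts A's hot set every round
assert run(partitioned=True) == 0.99   # A's hot set survives in its partition
```

The shared case hits 0% because the 64-line scan flushes A's 16 hot lines before each reuse; giving A a private 32-line partition makes every access after the first round a hit.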

Functional implementation techniques for CPU cache memories

IEEE Transactions on Computers, 1999
As the performance gap between processors and main memory continues to widen, increasingly aggressive implementations of cache memories are needed to bridge the gap. In this paper, we consider some of the issues that are involved in the implementation of highly optimized cache memories and survey the techniques that can be used to help achieve the ...
Jih-Kwon Peir, W.W. Hsu, A.J. Smith
openaire   +1 more source

CPU cache prefetching: Timing evaluation of hardware implementations

IEEE Transactions on Computers, 1998
Prefetching into CPU caches has long been known to be effective in reducing the cache miss ratio, but known implementations of prefetching have been unsuccessful in improving CPU performance. The reasons for this are that prefetches interfere with normal cache operations by making cache address and data ports busy, the memory bus busy, the memory banks ...
J. Tse, A.J. Smith
openaire   +1 more source
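The trade-off this abstract points at, fewer misses but no less bus traffic, shows up even in a toy model (a hedged sketch, not the paper's timing evaluation): next-line prefetching on a sequential scan nearly eliminates misses, yet every prefetch still occupies the memory bus.

```python
# Illustrative sketch: sequential scan over cache lines with and without
# next-line prefetching. Capacity and evictions are ignored on purpose;
# the point is miss count vs. bus occupancy, not replacement behavior.

def scan(num_lines, prefetch):
    cached = set()
    misses = bus_transfers = 0
    for line in range(num_lines):
        if line not in cached:
            misses += 1
            bus_transfers += 1          # demand fetch uses the bus
            cached.add(line)
        if prefetch and line + 1 not in cached:
            bus_transfers += 1          # the prefetch also uses the bus
            cached.add(line + 1)
    return misses, bus_transfers

assert scan(1000, prefetch=False) == (1000, 1000)
assert scan(1000, prefetch=True) == (1, 1001)   # misses gone, traffic not
```

This is the core of the abstract's argument: the prefetches hide latency only if the ports, bus, and banks they occupy are otherwise idle.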

Dynamic CPU cache management under the loop model

Proceedings of Southcon '95, 1995
This paper concerns cache designs that use replacement strategies to provide the high performance required by fast CPUs. We propose a new replacement technique that uses a heuristic to detect loop structures in the reference patterns. Initially, the proposed technique uses the least recently used (LRU) strategy.
C. Jaouhar, I. Mahgoub, R. Hewett
openaire   +1 more source
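The pathology motivating loop-aware replacement is easy to demonstrate (this sketch shows only the plain-LRU baseline the abstract starts from, not the authors' heuristic): a loop whose footprint is just one line larger than the cache makes LRU miss on every single reference.

```python
from collections import OrderedDict

# Baseline sketch: plain LRU thrashes on a loop slightly larger than
# the cache, because the line about to be reused is always the one
# just evicted. This is the case loop-detecting replacement targets.

def lru_loop_hit_rate(capacity, loop_len, iters):
    cache = OrderedDict()
    hits = refs = 0
    for _ in range(iters):
        for addr in range(loop_len):     # cyclic reference pattern
            refs += 1
            if addr in cache:
                cache.move_to_end(addr)
                hits += 1
            else:
                if len(cache) >= capacity:
                    cache.popitem(last=False)  # evict LRU line
                cache[addr] = True
    return hits / refs

assert lru_loop_hit_rate(capacity=8, loop_len=9, iters=10) == 0.0  # thrash
assert lru_loop_hit_rate(capacity=8, loop_len=8, iters=10) == 0.9  # fits
```

A loop-aware policy that pins most of the loop and sacrifices one line would instead hit on nearly the whole footprint every iteration.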

Accelerate Your Graphic Program with GPU/CPU Cache

2008 International Conference on Cyberworlds, 2008
This paper discusses how to optimize digital graphics programs using the cache system of the GPU/CPU architecture to gain more FPS. Firstly, we briefly introduce the basic principles of the cache system; secondly, we discuss the three main organization and mapping technologies of the cache system in detail, and then compare these three cache mapping solutions ...
Zhou Likun, Chen Dingfang
openaire   +1 more source
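The three classic mapping organizations the abstract compares (direct-mapped, set-associative, fully associative) differ only in how many lines a given memory block may occupy. A small sketch of the standard address decomposition (textbook material, not code from the paper):

```python
# Illustrative sketch: where a memory block may be placed in a cache of
# `num_lines` lines of `line_size` bytes with associativity `ways`.
# ways=1 is direct-mapped; ways=num_lines is fully associative;
# anything in between is set-associative.

def placement(addr, num_lines=64, line_size=64, ways=1):
    block = addr // line_size        # which memory block the address is in
    num_sets = num_lines // ways     # candidate sets shrink as ways grow
    set_index = block % num_sets     # the one set this block may live in
    tag = block // num_sets          # distinguishes blocks sharing a set
    return tag, set_index

# Address 0x1234 -> block 72 in a 64-line, 64-byte-line cache:
assert placement(0x1234, ways=1) == (1, 8)    # direct-mapped: one slot
assert placement(0x1234, ways=2) == (2, 8)    # 2-way: one of 2 slots
assert placement(0x1234, ways=64) == (72, 0)  # fully associative: any slot
```

Higher associativity reduces conflict misses at the cost of comparing more tags per lookup, which is the design trade-off the paper weighs for graphics workloads.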

Extending a CPU Cache for Efficient IPv6 Lookup

2018 IEEE 61st International Midwest Symposium on Circuits and Systems (MWSCAS), 2018
Increasing throughput requirements for Internet routers and growing routing table sizes have emphasized the need for fast and scalable packet forwarding systems. This paper presents a hardware cache-based IPv6 lookup system. Our goal is to study how much performance can be achieved with a lookup system that is implemented by modifying a processor cache.
Benjamin Wolff   +3 more
openaire   +2 more sources
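The operation being cached here is IPv6 longest-prefix match. As a software sketch of the general idea (the routing table, next-hop names, and cache size below are made up; the paper implements this in a modified hardware cache, not in Python), a small LRU cache of recently resolved destinations can short-circuit repeated table walks:

```python
import ipaddress
from collections import OrderedDict

# Illustrative sketch: longest-prefix match over a tiny routing table,
# fronted by a small LRU cache keyed on full destination addresses.

TABLE = {                      # prefix -> next hop (hypothetical entries)
    "2001:db8::/32": "if0",
    "2001:db8:aaaa::/48": "if1",
    "::/0": "default",
}
# Sort longest prefix first so the first containing network wins.
NETS = sorted(((ipaddress.ip_network(p), hop) for p, hop in TABLE.items()),
              key=lambda n: n[0].prefixlen, reverse=True)

CACHE_LINES = 4
cache = OrderedDict()

def lookup(dst):
    if dst in cache:                     # cache hit: skip the LPM walk
        cache.move_to_end(dst)
        return cache[dst]
    addr = ipaddress.ip_address(dst)
    hop = next(h for net, h in NETS if addr in net)   # longest match
    if len(cache) >= CACHE_LINES:
        cache.popitem(last=False)        # evict least recently used entry
    cache[dst] = hop
    return hop

assert lookup("2001:db8:aaaa::1") == "if1"   # /48 wins over /32
assert lookup("2001:db8:1::1") == "if0"
assert lookup("2400:cb00::1") == "default"
```

The paper's question is how far a repurposed processor cache can push this hit path in hardware, where the cached result avoids the multi-memory-access trie or table walk entirely.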

Cache-efficient parallel eikonal solver for multicore CPUs

Computational Geosciences, 2018
Alexandr A. Nikitin   +2 more
openaire   +2 more sources

Line (Block) Size Choice for CPU Cache Memories

IEEE Transactions on Computers, 1987
The line (block) size of a cache memory is one of the parameters that most strongly affects cache performance. In this paper, we study the factors that relate to the selection of a cache line size. Our primary focus is on the cache miss ratio, but we also consider influences such as logic complexity, address tags, line crossers, I/O overruns, etc.
openaire   +2 more sources
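The line-size trade-off the abstract studies can be seen in a toy experiment (a sketch in the spirit of the paper's question, with made-up sizes and traces, not its methodology): with a fixed cache budget, longer lines help a sequential scan but hurt a large-stride loop once the reduced line count can no longer hold its footprint.

```python
from collections import OrderedDict

# Illustrative sketch: miss counts for different line sizes, fixed 4 KiB
# of total cache, LRU replacement.

def misses(trace, cache_bytes, line_size):
    lines = OrderedDict()
    capacity = cache_bytes // line_size   # fewer lines as lines grow
    miss = 0
    for addr in trace:
        line = addr // line_size
        if line in lines:
            lines.move_to_end(line)
        else:
            miss += 1
            if len(lines) >= capacity:
                lines.popitem(last=False)  # evict LRU line
            lines[line] = True
    return miss

seq = list(range(0, 4096 * 4, 4))            # sequential 4-byte loads
strided = [i * 256 for i in range(32)] * 8   # 32-line loop, 256 B stride

assert misses(seq, 4096, 16) == 1024     # short lines: a miss per 16 B
assert misses(seq, 4096, 256) == 64      # long lines exploit spatial locality
assert misses(strided, 4096, 64) == 32   # loop footprint fits: warm-up only
assert misses(strided, 4096, 256) == 256 # 16 long lines < 32-line loop: thrash
```

This is the tension the paper quantifies: longer lines lower the miss ratio for spatially local references but shrink the number of distinct blocks the cache can hold, on top of the transfer-time and I/O-overrun costs the abstract lists.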

Interrupt Triggered Software Prefetching for Embedded CPU Instruction Cache

12th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'06), 2006
In embedded systems, handling time-critical real-time tasks is a challenge. The software may not only multi-task to improve response time, but also support events and interrupts, forcing the system to balance multiple priorities. Further, pre-emptive task switching hampers efficient interrupt processing, leading to instruction cache misses.
K.W. Batcher, R.A. Walker
openaire   +1 more source
