
Fast Query Processing by Distributing an Index over CPU Caches [PDF]

open access: yes
2005 IEEE International Conference on Cluster Computing, 2005
New version published at IEEE Cluster Computing ...
Xiaoqin Ma, Gene Cooperman
openaire   +2 more sources

Converting an Integer to a Decimal String in Under Two Nanoseconds

open access: yes
Software: Practice and Experience, EarlyView.
Abstract. Objective: Converting binary integers to variable‐length decimal strings is a fundamental operation in computing. Conventional fast approaches rely on recursive division and small lookup tables. The goal of this work is to develop a significantly faster method for this task.
Jaël Champagne Gareau, Daniel Lemire
wiley   +1 more source

Highly efficient parallel direct solver for solving dense complex matrix equations from method of moments

open access: yes
The Journal of Engineering, 2017
Based on the vectorised and cache optimised kernel, a parallel lower upper decomposition with a novel communication avoiding pivoting scheme is developed to solve dense complex matrix equations generated by the method of moments.
Yan Chen   +4 more
doaj   +1 more source

Casper: Accelerating Stencil Computations Using Near-Cache Processing

open access: yes
IEEE Access, 2023
Stencil computations are commonly used in a wide variety of scientific applications, ranging from large-scale weather prediction to solving partial differential equations.
Alain Denzler   +6 more
doaj   +1 more source

Harnessing Integrated CPU-GPU System Memory for HPC: a first look into Grace Hopper [PDF]

open access: yes
International Conference on Parallel Processing
Memory management across discrete CPU and GPU physical memory is traditionally achieved through explicit GPU allocations and data copy or unified virtual memory.
Gabin Schieffer   +4 more
semanticscholar   +1 more source

Multitype Game Optimisation: A Two‐Stage Fine‐Tuning Framework for Multi‑Game Optimisation With Large Language Models

open access: yes
CAAI Transactions on Intelligence Technology, EarlyView.
Abstract. Large language models (LLMs) have made remarkable advances in natural language processing, demonstrating great potential in modelling structured sequences. However, adapting these capabilities to machine gaming tasks such as Go remains challenging due to limitations in strategy generalisation and optimisation efficiency.
Xiali Li   +5 more
wiley   +1 more source

A Simple Cache Emulator for Evaluating Cache Behavior for SMP Systems

open access: yes
Acta Polytechnica, 2006
Every modern CPU uses a complex memory hierarchy, which consists of multiple cache memory levels. It is very difficult to predict the behavior of this hierarchy for a given program (for details see [1, 2]).
I. Šimeček
doaj  

Cache-optimized BFS on multi-core CPUs

open access: yes
Proceedings of the 1st FastCode Programming Challenge
Breadth-First Search (BFS) performance on shared-memory systems is often limited by irregular memory access and cache inefficiencies. This work presents two optimizations for BFS graph traversal: a bitmap-based algorithm designed for small-diameter graphs and MergedCSR, a graph storage format that improves cache locality for large-scale graphs ...
Salvatore Domenico Andaloro   +2 more
openaire   +1 more source

GEMM-ArchProfiler: A simulation framework for hardware-level profiling and performance analysis of General Matrix Multiplication in real CNN workloads on heterogeneous CPU architectures

open access: yes
SoftwareX
In this paper, the authors present GEMM-ArchProfiler, a simulation framework for evaluating General Matrix Multiplication performance in convolutional neural networks.
Binu Ayyappan, G. Santhosh Kumar
doaj   +1 more source

NuMagSANS: a GPU‐accelerated open‐source software package for the generic computation of nuclear and magnetic small‐angle neutron scattering observables of complex systems

open access: yes
Journal of Applied Crystallography, EarlyView.
NuMagSANS, a GPU‐accelerated software package for the computation of nuclear and magnetic small‐angle neutron scattering (SANS) cross sections and correlation functions of complex systems, is presented.
Michael P. Adams, Andreas Michels
wiley   +1 more source
