A Fast Causal Profiler for Task Parallel Programs
This paper proposes TASKPROF, a profiler that identifies parallelism bottlenecks in task parallel programs. It leverages the structure of a task parallel execution to perform fine-grained attribution of work to various parts of the program.
Nagarakatte, Santosh, Yoga, Adarsh
core +1 more source
GPU Acceleration of Melody Accurate Matching in Query-by-Humming
As melody databases grow in scale, query-by-humming systems face a trade-off between response speed and retrieval accuracy. Accurate melody matching is the key factor limiting response speed.
Limin Xiao +4 more
doaj +1 more source
Performance monitoring and analysis of task-based OpenMP. [PDF]
OpenMP, a typical shared-memory programming paradigm, has been widely applied in the high-performance computing community owing to the popularity of multicore architectures in recent years.
Yi Ding, Kai Hu, Kai Wu, Zhenlong Zhao
doaj +1 more source
Porting Decision Tree Algorithms to Multicore using FastFlow [PDF]
The whole computer hardware industry embraced multicores. For these machines, the extreme optimisation of sequential algorithms is no longer sufficient to squeeze the real machine power, which can be only exploited via thread-level parallelism.
A.C. Sodan +17 more
core +5 more sources
A Scheduling Method of Moldable Parallel Tasks Considering Speedup and System Load on the Cloud
A moldable parallel task (MPT) is a kind of parallel task whose sub-tasks hold their resources exclusively; MPTs have been widely used in different areas. Our paper focuses on the scheduling of moldable tasks when every sub-task supports time-slicing.
Jianmin Li, Ying Zhong, Xin Zhang
doaj +1 more source
Timing Analysis for DAG-based and GFP Scheduled Tasks [PDF]
Modern embedded systems have made the transition from single-core to multi-core architectures, providing performance improvement via parallelism rather than higher clock frequencies. DAGs are considered among the most generic task models in the real-time
Marinho, José, Petters, Stefan M.
core
Reservation-Based Federated Scheduling for Parallel Real-Time Tasks
This paper considers the scheduling of parallel real-time tasks with arbitrary deadlines. Each job of a parallel task is described as a directed acyclic graph (DAG).
Agrawal, Kunal +4 more
core +1 more source
Task decomposition using pattern distributor [PDF]
In this paper, we propose a new task decomposition method for multilayered feedforward neural networks, namely Task Decomposition with Pattern Distributor, in order to shorten the training time and improve the generalization accuracy of a network under ...
Bao, C, Guan, SU, Neo, T
core +1 more source
An all‐in‐one analog AI accelerator is presented, enabling on‐chip training, weight retention, and long‐term inference acceleration. It leverages a BEOL‐integrated CMO/HfOx ReRAM array with low‐voltage operation (<1.5 V), multi‐bit capability over 32 states, low programming noise (10 nS), and near‐ideal weight transfer.
Donato Francesco Falcone +11 more
wiley +1 more source
A Pipeline-Based ODE Solving Framework
The traditional parallel solving methods of ordinary differential equations (ODE) are mainly classified into task-parallelism, data-parallelism, and instruction-level parallelism.
Ruixia Cao, Shangjun Hou, Lin Ma
doaj +1 more source

