Efficient tree-traversals: reconciling parallelism and dense data representations [PDF]
Recent work showed that compiling functional programs to use dense, serialized memory representations for recursive algebraic datatypes can yield significant constant-factor speedups for sequential programs.
Chaitanya Koparkar+4 more
semanticscholar +1 more source
Achieving new SQL query performance levels through parallel execution in SQL Server [PDF]
This article provides an in-depth look at implementing parallel SQL query processing using the Microsoft SQL Server database management system. It examines how parallelism can significantly accelerate query execution by leveraging multi-core processors ...
Nuriev Marat+3 more
doaj +1 more source
Parallel Optimization Method of Unstructured-grid Computing in CFD for Domestic Heterogeneous Many-core Architecture [PDF]
Sunway TaihuLight ranked first in the global supercomputer top 500 list 2016-2018 with a peak performance of 125.4 PFlops. Its computing power is mainly attributed to the domestic SW26010 many-core RISC processor. CFD unstructured-grid computing has always
CHEN Xin, LI Fang, DING Hai-xin, SUN Wei-ze, LIU Xin, CHEN De-xun, YE Yue-jin, HE Xiang
doaj +1 more source
Communication Optimization Schemes for Accelerating Distributed Deep Learning Systems
In a distributed deep learning system, a parameter server and workers must communicate to exchange gradients and parameters, and the communication cost increases as the number of workers increases.
Jaehwan Lee+4 more
doaj +1 more source
Novel VLSI Architectures and Micro-Cell Libraries for Subscalar Computations
Parallelism is the key to enhancing the throughput of computing structures. However, it is well established that the presence of data-flow dependencies adversely impacts the exploitation of such parallelism. This paper presents a case for a new computing
Kumar Sambhav Pandey, Hitesh Shrimali
doaj +1 more source
Distributed Machine Learning Using Data Parallelism on Mobile Platform
Machine learning has many challenges, and one of them is dealing with large datasets, because their size grows continuously year by year. One solution to this problem is data parallelism. This paper investigates the expansion of data parallelism to
M. Szabó
semanticscholar +1 more source
Parallel Efficient Data Loading [PDF]
In this paper we discuss how we architected and developed a parallel data loader for the LeanXcale database. The loader is characterized by its efficiency and parallelism. LeanXcale can scale up and scale out to very large numbers, and loading data in the traditional way does not exploit its full potential in terms of the loading rate it can reach. For
Jiménez-Peris, Ricardo+5 more
openaire +3 more sources
Integrating parallelism and asynchrony for high-performance software development [PDF]
This article delves into the crucial roles of parallelism and asynchrony in the development of high-performance software programs. It provides an insightful exploration into how these methodologies enhance computing systems' efficiency and performance ...
Zaripova Rimma+2 more
doaj +1 more source
SingleCaffe: An Efficient Framework for Deep Learning on a Single Node
Deep learning (DL) is currently the most promising approach in complicated applications such as computer vision and natural language processing. It thrives with large neural networks and large datasets.
Chenxu Wang+5 more
doaj +1 more source
Towards accelerating model parallelism in distributed deep learning systems
Modern deep neural networks often cannot be trained on a single GPU due to large model size and large data size. Model parallelism splits a model across multiple GPUs, but making it scalable and seamless is challenging due to different information sharing ...
Hyeonseong Choi+3 more
doaj +1 more source