Distributed training of large language models: A survey
The emergence of large language models (LLMs) such as ChatGPT has opened up groundbreaking possibilities, enabling a wide range of applications in diverse fields, including healthcare, law, and education.
Fanlong Zeng +3 more
doaj +1 more source
Efficient electro-magnetic analysis of a GPU bitsliced AES implementation
The advent of CUDA-enabled GPUs makes it possible to provide cloud applications with high-performance data security services. Unfortunately, recent studies have shown that GPU-based applications are also susceptible to side-channel attacks.
Yiwen Gao, Yongbin Zhou, Wei Cheng
doaj +1 more source
BISMO: A Scalable Bit-Serial Matrix Multiplication Overlay for Reconfigurable Computing
Matrix-matrix multiplication is a key computational kernel for numerous applications in science and engineering, with ample parallelism and data locality that lends itself well to high-performance implementations.
Rasnayake, Lahiru +2 more
core +1 more source
Demonstrating parallelism in quantitative laboratory tests is crucial to ensure accurate reporting of data and minimise risks to patients. Regulatory authorities make the demonstration of parallelism mandatory before approval for clinical use.
Axel Petzold
doaj +1 more source
Strategies and Principles of Distributed Machine Learning on Big Data
The rise of big data has led to new demands for machine learning (ML) systems to learn complex models, with millions to billions of parameters, that promise adequate capacity to digest massive datasets and offer powerful predictive analytics (such as ...
Eric P. Xing +3 more
doaj +1 more source
Design of Network on Chip Routing Units Based on Virtual Conflict Array
Network on Chip (NoC) routing units share the input buffers and only allow sequential access to data, which limits the speed and efficiency of communication on chip. To improve the parallelism of NoC, a routing unit architecture based on virtual conflict ...
YANG Tianhao, SUN Jin
doaj +1 more source
To define and identify a region-of-interest (ROI) in a digital image, the shape descriptor of the ROI has to be described in terms of its boundary characteristics. To address the generic issues of contour tracking, the yConvex Hypergraph (yCHG) model was
Agarwal, Tejaswi +2 more
core +1 more source
A snapshot of parallelism in distributed deep learning training
The accelerated development of applications related to artificial intelligence has driven the creation of increasingly complex neural network models with enormous numbers of parameters, currently reaching up to trillions of parameters.
Hairol Romero-Sandí +2 more
doaj +1 more source
A Logical Model and Data Placement Strategies for MEMS Storage Devices
MEMS storage devices are new non-volatile secondary storage devices that have outstanding advantages over magnetic disks. MEMS storage devices, however, differ considerably from magnetic disks in structure and access characteristics.
Kim, Min-Soo +3 more
core +1 more source
Pipeline Parallelism With Elastic Averaging
To accelerate the training of massive DNN models on large-scale datasets, distributed training techniques, including data parallelism and model parallelism, have been extensively studied.
Bongwon Jang, In-Chul Yoo, Dongsuk Yook
doaj +1 more source

