
Implementing implicit OpenMP data sharing on GPUs

open access: yes, 2017
OpenMP is a shared memory programming model which supports the offloading of target regions to accelerators such as NVIDIA GPUs. The implementation in Clang/LLVM aims to deliver a generic GPU compilation toolchain that supports both the native CUDA C/C++
Bataev, Alexey   +8 more
core   +1 more source

AutomataGPT: Transformer‐Based Forecasting and Ruleset Inference for Two‐Dimensional Cellular Automata

open access: yes, Advanced Science, EarlyView.
We introduce AutomataGPT, a generative pretrained transformer (GPT) trained on synthetic spatiotemporal data from 2D cellular automata to learn symbolic rules. Demonstrating strong performance on both forward and inverse tasks, AutomataGPT establishes a scalable, domain‐agnostic framework for interpretable modeling, paving the way for future ...
Jaime A. Berkovich   +2 more
wiley   +1 more source

Benchmarking the cost of thread divergence in CUDA

open access: yes, 2015
All modern processors include a set of vector instructions. While these give a tremendous boost to performance, exploiting them requires vectorized code.
Bialas, Piotr; Strzelecki, Adam
core   +1 more source

SKOOTS: Skeleton‐Oriented Object Segmentation for Mitochondria in High‐Resolution Cochlear EM Datasets

open access: yes, Advanced Science, EarlyView.
Skeleton‐oriented object segmentation (SKOOTS) introduces a new strategy for 3D mitochondrial instance segmentation by predicting explicit skeletons rather than relying on boundary cues. This approach enables robust analysis of densely packed organelles in large FIB‐SEM datasets.
Christopher J. Buswinka   +3 more
wiley   +1 more source

Instruction-Efficient and Parallelized AES using CUDA and PTX for Data Encryption

open access: yes, Sistemasi: Jurnal Sistem Informasi
This research presents an instruction-efficient and parallelized implementation of the AES-256 encryption algorithm using NVIDIA CUDA with inline PTX to optimize instruction usage and execution performance on GPUs. Conventional AES implementation on CUDA
Raditya Hakim Daniswara   +1 more
doaj   +1 more source

A Cloud Computing Service Architecture of a Parallel Algorithm Oriented to Scientific Computing with CUDA and Monte Carlo

open access: yes, Cybernetics and Information Technologies, 2013
General-purpose computing on graphics processing units (GPGPU) has become a whole new area of research due to the fast development of GPU hardware and programming tools, such as CUDA (Compute Unified Device Architecture).
Yimu Ji   +5 more
doaj   +1 more source

Deep Feature-based Face Detection on Mobile Devices

open access: yes, 2016
We propose a deep feature-based face detector for mobile devices that detects the user's face as captured by the front-facing camera. The proposed method is able to detect faces in images containing extreme pose and illumination variations as well as partial faces.
Chellappa, Rama   +2 more
core   +1 more source

Long‐Tea‐CLIP: An Expert‐Level Multimodal AI Framework for Fine‐Grained Green Tea Grading Across Five Sensory Dimensions

open access: yes, Advanced Science, EarlyView.
Long‐Tea‐CLIP (Contrastive Language‐Image Pre‐training) presents a multimodal AI framework that integrates visual, metabolomic, and sensory knowledge to grade green tea across appearance, soup color, aroma, taste, and infused leaf. By combining expert‐guided modeling with CLIP‐supervised learning, the system delivers fine‐grained quality evaluation and
Yanqun Xu   +9 more
wiley   +1 more source

Analysis and development tools for efficient programs on parallel architectures

open access: yes, Труды Института системного программирования РАН, 2018
The article proposes methods for supporting development of efficient programs for modern parallel architectures, including hybrid systems. Specialized profiling methods designed for programmers tasked with parallelizing existing code are proposed.
Alexander Monakov   +3 more
doaj   +1 more source

Programming GPUs with CUDA [PDF]

open access: yes, 2015
This document contains the material from a tutorial given at the conference. It is not a scientific article in the traditional format. We analyze the performance and features of the successive generations of graphics processors developed by Nvidia ...
Ujaldon-Martinez, Manuel
core  
