Results 11 to 20 of about 15,915,444

PaLI-X: On Scaling up a Multilingual Vision and Language Model [PDF]

open access: yes (arXiv.org, 2023)
We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture.
Xi Chen   +42 more
semanticscholar   +1 more source

Scaling Up Your Kernels to 31×31: Revisiting Large Kernel Design in CNNs [PDF]

open access: yes (Computer Vision and Pattern Recognition, 2022)
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few large convolutional kernels instead of a stack of small kernels could ...
Xiaohan Ding   +5 more
semanticscholar   +1 more source
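
The design contrast this snippet describes (a few very large kernels in place of a stack of small ones) translates directly into code. Below is a minimal sketch in PyTorch, assuming an illustrative channel count and input size; it is not the paper's implementation, just the generic substitution of a single large depthwise convolution for stacked 3x3 convolutions.

```python
# Minimal sketch (assumed setup, not the paper's code): one large depthwise
# convolution in place of a stack of small 3x3 convolutions.
import torch
import torch.nn as nn

channels = 64  # illustrative channel count

# Conventional design: the receptive field grows gradually through stacked 3x3 kernels.
small_kernel_stack = nn.Sequential(
    nn.Conv2d(channels, channels, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(channels, channels, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(channels, channels, kernel_size=3, padding=1),
)

# Large-kernel design: a single 31x31 depthwise convolution covers a wide
# receptive field in one layer; a 1x1 convolution then mixes channels.
large_kernel_block = nn.Sequential(
    nn.Conv2d(channels, channels, kernel_size=31, padding=15, groups=channels),
    nn.Conv2d(channels, channels, kernel_size=1),
)

x = torch.randn(1, channels, 56, 56)
print(small_kernel_stack(x).shape, large_kernel_block(x).shape)  # both keep 56x56
```

The depthwise grouping keeps the 31x31 layer's parameter and FLOP cost modest (C*k*k weights rather than C*C*k*k), which is part of what makes large-kernel designs practical at this scale.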

Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition [PDF]

open access: yes (Conference on Robot Learning, 2023)
We present a framework for robot skill acquisition which 1) efficiently scales up data generation of language-labelled robot data and 2) effectively distills this data down into a robust multi-task language-conditioned visuo-motor policy. For (1), we use ...
Huy Ha, Peter R. Florence, Shuran Song
semanticscholar   +1 more source

SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation [PDF]

open access: yes (Neural Information Processing Systems, 2023)
Expressive human pose and shape estimation (EHPS) unifies body, hands, and face motion capture with numerous applications. Despite encouraging progress, current state-of-the-art methods still depend largely on a confined set of training datasets. In this ...
Zhongang Cai   +12 more
semanticscholar   +1 more source

SVIT: Scaling up Visual Instruction Tuning [PDF]

open access: yes (arXiv.org, 2023)
Thanks to the emergence of foundation models, large language and vision models have been integrated to acquire multimodal abilities such as visual captioning, question answering, etc. Although existing multimodal models present impressive performance of visual ...
Bo Zhao, Boya Wu, Tiejun Huang
semanticscholar   +1 more source

Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory [PDF]

open access: yes (International Conference on Machine Learning, 2022)
Dataset Distillation is a newly emerging area that aims to distill large datasets into much smaller and highly informative synthetic ones to accelerate training and reduce storage.
Justin Cui   +3 more
semanticscholar   +1 more source
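
To make the stated goal concrete, here is a toy sketch of dataset distillation in one common formulation (gradient matching): a handful of synthetic images are treated as learnable tensors and optimized so that the gradient they induce in a model matches the gradient from real data. This is an illustrative, assumption-laden example, not the constant-memory method the paper proposes; the model, shapes, and the random stand-in "real" batch are placeholders.

```python
# Toy gradient-matching sketch of dataset distillation (assumed formulation,
# not the paper's constant-memory method): learn 10 synthetic images whose
# training gradient mimics that of a real batch.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # small stand-in classifier
loss_fn = nn.CrossEntropyLoss()

# Ten learnable synthetic images, one per class, optimized instead of collected.
syn_x = torch.randn(10, 1, 28, 28, requires_grad=True)
syn_y = torch.arange(10)
opt = torch.optim.Adam([syn_x], lr=0.1)

def grad_vector(x, y, create_graph):
    # Flattened gradient of the loss w.r.t. the model parameters.
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()), create_graph=create_graph)
    return torch.cat([g.flatten() for g in grads])

# Placeholder "real" batch; a real pipeline would load actual data here.
real_x, real_y = torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,))
real_grad = grad_vector(real_x, real_y, create_graph=False).detach()

for _ in range(100):
    opt.zero_grad()
    syn_grad = grad_vector(syn_x, syn_y, create_graph=True)
    # Push the gradient induced by the synthetic set toward the real-data gradient.
    match_loss = 1 - torch.cosine_similarity(syn_grad, real_grad, dim=0)
    match_loss.backward()
    opt.step()
```

After optimization, the small synthetic set can stand in for a much larger dataset when training similar models, which is what lets distilled data cut training time and storage.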

Scaling Up Models and Data with t5x and seqio [PDF]

open access: yes (Journal of Machine Learning Research, 2022)
Recent neural network-based language models have benefited greatly from scaling up the size of training datasets and the number of parameters in the models themselves.
Adam Roberts   +42 more
semanticscholar   +1 more source

More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity [PDF]

open access: yes (International Conference on Learning Representations, 2022)
Transformers have quickly shone in the computer vision world since the emergence of Vision Transformers (ViTs). The dominant role of convolutional neural networks (CNNs) seems to be challenged by increasingly effective transformer-based models.
Shiwei Liu   +8 more
semanticscholar   +1 more source

VeLO: Training Versatile Learned Optimizers by Scaling Up [PDF]

open access: yes (arXiv.org, 2022)
While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers. In this work, we leverage the same scaling approach behind the success of deep learning to learn versatile ...
Luke Metz   +10 more
semanticscholar   +1 more source
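
As a rough illustration of what a "learned optimizer" is, the sketch below replaces a hand-designed update rule (e.g., SGD or Adam) with a tiny MLP that maps per-parameter features to parameter updates. This is a conceptual toy, not VeLO; the feature set, network size, and usage shown are assumptions.

```python
# Conceptual toy of a learned optimizer (assumed design, not VeLO): a small MLP
# maps per-parameter features (gradient, momentum) to a parameter update.
import torch
import torch.nn as nn

class TinyLearnedOptimizer(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def step(self, params, grads, momenta, beta=0.9):
        new_params, new_momenta = [], []
        for p, g, m in zip(params, grads, momenta):
            m = beta * m + (1 - beta) * g                             # running momentum feature
            feats = torch.stack([g.flatten(), m.flatten()], dim=-1)  # (numel, 2)
            update = self.net(feats).view_as(p)                      # learned per-element update
            new_params.append(p - update)
            new_momenta.append(m)
        return new_params, new_momenta

# Usage on a toy quadratic loss ||p||^2. In meta-training, the MLP's own weights
# would be optimized so that the updates it emits reduce losses across many tasks.
learned_opt = TinyLearnedOptimizer()
params, momenta = [torch.randn(5)], [torch.zeros(5)]
grads = [2 * params[0]]
params, momenta = learned_opt.step(params, grads, momenta)
```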

Scaling Up Vision-Language Pretraining for Image Captioning [PDF]

open access: yes (Computer Vision and Pattern Recognition, 2021)
In recent years, we have witnessed a significant performance boost in the image captioning task based on vision-language pre-training (VLP). Scale is believed to be an important factor for this advance.
Xiaowei Hu   +6 more
semanticscholar   +1 more source
