Results 21 to 30 of about 947,558
RT-1: Robotics Transformer for Real-World Control at Scale [PDF]
By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been ...
Anthony Brohan +50 more
semanticscholar +1 more source
Uformer: A General U-Shaped Transformer for Image Restoration [PDF]
In this paper, we present Uformer, an effective and efficient Transformer-based architecture for image restoration, in which we build a hierarchical encoder-decoder network using the Transformer block. In Uformer, there are two core designs.
Zhendong Wang +3 more
semanticscholar +1 more source
Point Transformer [PDF]
In this work, we present Point Transformer, a deep neural network that operates directly on unordered and unstructured point sets. We design Point Transformer to extract local and global features and relate both representations by introducing the local ...
Nico Engel +2 more
semanticscholar +1 more source
mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer [PDF]
The recent “Text-to-Text Transfer Transformer” (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks.
Linting Xue +7 more
semanticscholar +1 more source
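The unified text-to-text format the snippet mentions casts every task, including translation and classification, as mapping one input string to one target string, so a single model and loss cover them all. A minimal illustration in Python; the task prefixes and sentence pairs below are illustrative, styled after the examples in the T5 paper rather than quoted from it:

```python
# Every task is expressed as (input string, target string). A task prefix
# such as "translate English to German:" tells the model which task to do.
examples = [
    ("translate English to German: The house is wonderful.",
     "Das Haus ist wunderbar."),
    ("summarize: state authorities dispatched emergency crews on tuesday ...",
     "six people hospitalized after a storm."),
    ("cola sentence: The course is jumping well.",
     "not acceptable"),  # even classification labels are emitted as text
]

for source, target in examples:
    print(f"{source!r} -> {target!r}")
```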
Segmenter: Transformer for Semantic Segmentation [PDF]
Image segmentation is often ambiguous at the level of individual image patches and requires contextual information to reach label consensus. In this paper we introduce Segmenter, a transformer model for semantic segmentation.
Robin Strudel +3 more
semanticscholar +1 more source
Transformer-XL: Attentive Language Models beyond a Fixed-Length Context [PDF]
Transformers have the potential to learn longer-term dependencies, but are limited by a fixed-length context in the setting of language modeling. We propose a novel neural architecture, Transformer-XL, that enables learning dependency beyond a fixed length ...
Zihang Dai +5 more
semanticscholar +1 more source
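A rough NumPy sketch of the segment-level recurrence idea behind the paper: hidden states from the previous segment are cached and prepended to the keys and values of the current one, so information can flow past the fixed segment boundary. The real model also uses relative positional encodings and holds the cache fixed (no gradient); all shapes and names here are illustrative.

```python
import numpy as np

def attention(q, k, v):
    """Plain scaled dot-product attention (single head, no masking)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

seg_len, d_model = 4, 8
rng = np.random.default_rng(0)
tokens = rng.standard_normal((3 * seg_len, d_model))  # long sequence, 3 segments
memory = np.zeros((0, d_model))                       # cache from prior segments

for i in range(3):
    segment = tokens[i * seg_len:(i + 1) * seg_len]
    # Keys/values cover the cached previous segment plus the current one,
    # extending the effective context beyond the fixed segment length.
    context = np.concatenate([memory, segment], axis=0)
    out = attention(segment, context, context)
    memory = segment  # cache the current segment for the next step
```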
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis [PDF]
The most advanced text-to-image (T2I) models require significant training costs (e.g., millions of GPU hours), seriously hindering fundamental innovation in the AIGC community while increasing CO2 emissions.
Junsong Chen +10 more
semanticscholar +1 more source
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification [PDF]
The recently developed vision transformer (ViT) has achieved promising results on image classification compared to convolutional neural networks. Inspired by this, in this paper, we study how to learn multi-scale feature representations in transformer ...
Chun-Fu Chen, Quanfu Fan, Rameswar Panda
semanticscholar +1 more source
A Survey on Vision Transformer [PDF]
Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism. Thanks to its strong representation capabilities, researchers are looking at ways to apply transformer ...
Kai Han +12 more
semanticscholar +1 more source
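As context for how transformers reach vision at all: the usual first step is to cut an image into fixed-size patches and flatten each into a token, after which the standard self-attention machinery applies unchanged. A minimal sketch, with shapes matching ViT-Base's 16x16 patches; the function name is ours:

```python
import numpy as np

def image_to_patch_tokens(img, patch=16):
    """Split an (H, W, C) image into non-overlapping patches and flatten each
    into a vector, yielding the token sequence a vision transformer consumes."""
    H, W, C = img.shape
    assert H % patch == 0 and W % patch == 0
    return (img.reshape(H // patch, patch, W // patch, patch, C)
               .transpose(0, 2, 1, 3, 4)          # group by patch row/column
               .reshape(-1, patch * patch * C))   # (num_patches, patch*patch*C)

img = np.zeros((224, 224, 3))
print(image_to_patch_tokens(img).shape)  # (196, 768): 14x14 patches of dim 768
```

Each 768-dim patch vector is then linearly projected and given a positional embedding before entering the transformer encoder.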
RWKV: Reinventing RNNs for the Transformer Era [PDF]
Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence length.
Bo Peng +31 more
semanticscholar +1 more source
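The quadratic scaling the snippet cites comes from the n x n matrix of pairwise token scores that full self-attention materializes in every layer; an RNN-style model such as RWKV instead carries a fixed-size state, so per-token cost stays constant in sequence length. A back-of-envelope illustration with purely illustrative numbers (one head, one fp32 score matrix):

```python
# Doubling the sequence length quadruples the attention score matrix.
for n in (1_024, 2_048, 4_096, 8_192):
    entries = n * n                   # pairwise attention scores
    mib = entries * 4 / 2**20         # fp32 bytes -> MiB
    print(f"n={n:>5}: {entries:>12,} scores, ~{mib:8.1f} MiB")
```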

