Training language models to follow instructions with human feedback
Neural Information Processing Systems, 2022
Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user.
Long Ouyang +19 more
semanticscholar +1 more source
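The approach behind this paper (RLHF) fine-tunes a language model against human preference data in three stages: supervised fine-tuning, reward modeling, and PPO. As a minimal sketch of one ingredient only, the pairwise reward-model objective below trains a scorer to rank the human-preferred completion above the dispreferred one; the NumPy helper and its names are illustrative assumptions, not the paper's code, and the PPO stage is not shown.

    import numpy as np

    def reward_model_pairwise_loss(chosen_rewards, rejected_rewards):
        """Pairwise ranking loss for a preference-based reward model.

        chosen_rewards / rejected_rewards: scalar reward-model scores for the
        human-preferred and dispreferred completions of the same prompt.
        The loss pushes the preferred completion's score above the other one.
        """
        margin = np.asarray(chosen_rewards) - np.asarray(rejected_rewards)
        # -log sigmoid(margin), written in a numerically stable form
        return np.mean(np.logaddexp(0.0, -margin))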
Sigmoid Loss for Language Image Pre-Training
IEEE International Conference on Computer Vision, 2023
We propose a simple pairwise sigmoid loss for image-text pre-training. Unlike standard contrastive learning with softmax normalization, the sigmoid loss operates solely on image-text pairs and does not require a global view of the pairwise similarities ...
Xiaohua Zhai +3 more
semanticscholar +1 more source
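A minimal NumPy sketch of a pairwise sigmoid loss of this kind is below: every image-text pair in the batch gets an independent binary label (+1 for the matching pair on the diagonal, -1 otherwise), so no batch-wide softmax normalization is needed. The temperature and bias defaults are assumptions loosely based on the initialization reported in the paper.

    import numpy as np

    def sigmoid_loss(img_emb, txt_emb, temperature=10.0, bias=-10.0):
        """Pairwise sigmoid loss over all image-text pairs in a batch.

        img_emb, txt_emb: (B, D) embedding matrices; row i of each is a
        matching pair. temperature/bias defaults are assumed initial values.
        """
        # L2-normalize the embeddings before taking dot products.
        img_emb = img_emb / np.linalg.norm(img_emb, axis=-1, keepdims=True)
        txt_emb = txt_emb / np.linalg.norm(txt_emb, axis=-1, keepdims=True)

        logits = temperature * img_emb @ txt_emb.T + bias     # (B, B) pair scores
        labels = 2.0 * np.eye(len(img_emb)) - 1.0             # +1 diagonal, -1 elsewhere
        # -log sigmoid(label * logit), summed over pairs, averaged over images
        per_pair = np.logaddexp(0.0, -labels * logits)
        return np.mean(np.sum(per_pair, axis=-1))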
Training Compute-Optimal Large Language Models
Advances in Neural Information Processing Systems 35, 2022
We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget. We find that current large language models are significantly undertrained, a consequence of the recent focus on scaling ...
Jordan Hoffmann +21 more
semanticscholar +1 more source
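Two commonly cited takeaways from this line of work are that training compute is roughly C ≈ 6·N·D FLOPs for N parameters and D training tokens, and that the compute-optimal ratio is roughly 20 tokens per parameter. Treating both as assumptions rather than the paper's exact fitted laws, a back-of-the-envelope allocation helper might look like this:

    def compute_optimal_allocation(flops_budget):
        """Split a training FLOP budget between parameters and tokens.

        Assumes C ~ 6 * N * D and the rough rule of thumb D ~ 20 * N, so
        C ~ 120 * N**2 and N ~ sqrt(C / 120).
        """
        n_params = (flops_budget / 120.0) ** 0.5
        n_tokens = 20.0 * n_params
        return n_params, n_tokens

    # Example: a budget of ~5.76e23 FLOPs gives roughly 7e10 parameters
    # and 1.4e12 tokens, i.e. about a 70B model trained on 1.4T tokens.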
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
North American Chapter of the Association for Computational Linguistics, 2019
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a; Radford et al., 2018), BERT is designed to pre ...
Jacob Devlin +3 more
semanticscholar +1 more source
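BERT's pre-training centers on masked language modeling: a fraction of input tokens is selected as prediction targets and, per the paper, 80% of those are replaced with [MASK], 10% with a random token, and 10% left unchanged. A minimal sketch is below; the token ids and the -100 ignore-index convention are assumptions (the latter borrowed from common PyTorch practice), not details from the paper.

    import random

    MASK_ID, VOCAB_SIZE = 103, 30522   # assumed ids, roughly matching BERT's WordPiece vocab

    def mask_tokens(token_ids, mask_prob=0.15):
        """BERT-style masking: pick ~15% of positions as targets (Bernoulli per
        position as a simplification); of those, 80% become [MASK], 10% a random
        token, 10% stay unchanged. Returns (corrupted inputs, labels), where
        labels hold the original token at target positions and -100 elsewhere."""
        inputs, labels = list(token_ids), [-100] * len(token_ids)
        for i, tok in enumerate(token_ids):
            if random.random() < mask_prob:
                labels[i] = tok
                r = random.random()
                if r < 0.8:
                    inputs[i] = MASK_ID
                elif r < 0.9:
                    inputs[i] = random.randrange(VOCAB_SIZE)
                # else: keep the original token
        return inputs, labels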

