BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models [PDF]
The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models. This paper proposes BLIP-2, a generic and efficient pre-training strategy that bootstraps vision-language pre-training from frozen pre-trained image encoders and frozen large language models.
Junnan Li+3 more
semanticscholar +1 more source
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback [PDF]
We apply preference modeling and reinforcement learning from human feedback (RLHF) to finetune language models to act as helpful and harmless assistants.
Yuntao Bai+30 more
semanticscholar +1 more source
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling [PDF]
How do large language models (LLMs) develop and evolve over the course of training? How do these patterns change as models scale? To answer these questions, we introduce Pythia, a suite of 16 LLMs all trained on public data seen in the exact ...
Stella Biderman+12 more
semanticscholar +1 more source
An Empirical Study of Training Self-Supervised Vision Transformers [PDF]
This paper does not describe a novel method. Instead, it studies a straightforward, incremental, yet must-know baseline given the recent progress in computer vision: self-supervised learning for Vision Transformers (ViT).
Xinlei Chen, Saining Xie, Kaiming He
semanticscholar +1 more source
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet [PDF]
Transformers, which are popular for language modeling, have been explored for solving vision tasks recently, e.g., the Vision Transformer (ViT) for image classification. The ViT model splits each image into a sequence of tokens with fixed length and then ...
Li Yuan+7 more
semanticscholar +1 more source
Jailbroken: How Does LLM Safety Training Fail? [PDF]
Large language models trained for safety and harmlessness remain susceptible to adversarial misuse, as evidenced by the prevalence of "jailbreak" attacks on early releases of ChatGPT that elicit undesired behavior. Going beyond recognition of the issue, we ...
Alexander Wei+2 more
semanticscholar +1 more source
Vocational Teacher Productivity in Palembang: Education Production Function [PDF]
In the education sector, direct estimates of worker productivity are available for the majority of the workforce. In recent years, educational economists have examined productivity returns to work experience among teachers using predicted contributions to ...
Evi Oktavia
doaj +1 more source
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training [PDF]
Pre-training video transformers on extra large-scale datasets is generally required to achieve premier performance on relatively small datasets. In this paper, we show that video masked autoencoders (VideoMAE) are data-efficient learners for self-supervised video pre-training.
Zhan Tong+3 more
semanticscholar +1 more source
Grounded Language-Image Pre-training [PDF]
This paper presents a grounded language-image pre-training (GLIP) model for learning object-level, language-aware, and semantic-rich visual representations. GLIP unifies object detection and phrase grounding for pre-training.
Liunian Harold Li+11 more
semanticscholar +1 more source
The effect of Mindfulness on Attention and Comprehension in Children with Specific Learning Disability with Impairment in Reading [PDF]
The current research aimed to study the effectiveness of mindfulness training on attention and comprehension in children with a specific learning disability with impairment in reading.
Shahrooz Nemati+2 more
doaj +1 more source