How to Fine-Tune BERT for Text Classification?
Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art pre-trained language model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in ...
Xuanjing Huang +3 more
core +2 more sources
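The entry above describes the standard fine-tuning recipe only at a high level. A minimal sketch of that recipe, here using the Hugging Face transformers API with a toy two-example corpus (both the library choice and the data are illustrative assumptions, not the paper's own code), might look like:

```python
# Minimal sketch: fine-tuning BERT for binary text classification.
# Library, checkpoint name, and toy data are assumptions for illustration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

texts = ["the movie was great", "a complete waste of time"]  # toy corpus
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few gradient steps, purely for illustration
    outputs = model(**batch, labels=labels)  # cross-entropy loss computed internally
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```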
Spanish Pre-trained BERT Model and Evaluation Data [PDF]
The Spanish language is one of the top 5 spoken languages in the world. Nevertheless, finding resources to train or evaluate Spanish language models is not an easy task.
J. Cañete +5 more
semanticscholar +1 more source
ET-BERT: A Contextualized Datagram Representation with Pre-training Transformers for Encrypted Traffic Classification [PDF]
Encrypted traffic classification requires discriminative and robust traffic representation captured from content-invisible and imbalanced traffic data for accurate classification, which is challenging but indispensable to achieve network security and ...
Xinjie Lin +5 more
semanticscholar +1 more source
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling [PDF]
We present Point-BERT, a new paradigm for learning Transformers that generalizes the concept of BERT to 3D point clouds. Inspired by BERT, we devise a Masked Point Modeling (MPM) task to pre-train point cloud Transformers. Specifically, we first divide a point cloud into several local point patches ...
Xumin Yu +5 more
semanticscholar +1 more source
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks [PDF]
BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, this setup requires that both sentences be fed into the network, which causes a massive computational overhead ...
Nils Reimers, Iryna Gurevych
semanticscholar +1 more source
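Because Sentence-BERT encodes each sentence into an independent fixed-size vector, similarity can be computed with a single cosine comparison instead of a full cross-encoder pass per pair. A minimal usage sketch with the sentence-transformers library that accompanies the paper (the checkpoint name and example sentences are assumptions) could be:

```python
# Sketch: sentence embeddings and pairwise cosine similarity with sentence-transformers.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # any SBERT-style checkpoint works here
sentences = [
    "A man is playing a guitar.",
    "Someone plays an instrument.",
    "The sky is blue.",
]

embeddings = model.encode(sentences, convert_to_tensor=True)  # one vector per sentence
scores = util.cos_sim(embeddings, embeddings)                 # full similarity matrix
print(scores)
```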
A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT [PDF]
Pretrained Foundation Models (PFMs) are regarded as the foundation for various downstream tasks with different data modalities. A PFM (e.g., BERT, ChatGPT, and GPT-4) is trained on large-scale data, which provides a reasonable parameter initialization for a wide range of downstream applications ...
Ce Zhou +26 more
semanticscholar +1 more source
Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT [PDF]
Recently, ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries. Several prior studies have shown that ChatGPT attains remarkable generation ability compared with existing models.
Qihuang Zhong +4 more
semanticscholar +1 more source
TinyBERT: Distilling BERT for Natural Language Understanding [PDF]
Language model pre-training, such as BERT, has significantly improved the performances of many natural language processing tasks. However, pre-trained language models are usually computationally expensive, so it is difficult to efficiently execute them ...
Xiaoqi Jiao +7 more
semanticscholar +1 more source
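TinyBERT's full recipe distills embeddings, hidden states, and attention matrices from the teacher in addition to its output distribution; the sketch below shows only the generic prediction-layer part of knowledge distillation (soft targets with a temperature, matched via KL divergence), with made-up tensor shapes standing in for real model outputs:

```python
# Generic knowledge-distillation sketch in PyTorch (prediction layer only).
# The random tensors are placeholders for teacher (BERT-base) and student (TinyBERT) logits.
import torch
import torch.nn.functional as F

temperature = 2.0
teacher_logits = torch.randn(8, 2)                        # stand-in teacher outputs
student_logits = torch.randn(8, 2, requires_grad=True)    # stand-in student outputs

soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
log_probs = F.log_softmax(student_logits / temperature, dim=-1)

# KL divergence between softened distributions, scaled by T^2 as is standard.
distill_loss = F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2
distill_loss.backward()
```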
What Does BERT Look at? An Analysis of BERT’s Attention [PDF]
Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data.
Kevin Clark +3 more
semanticscholar +1 more source
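The kind of attention analysis described above can be reproduced by asking the model to return its attention maps. A short sketch with the Hugging Face transformers API (an assumption; the original study shipped its own analysis code) looks like:

```python
# Sketch: inspecting BERT's attention maps via output_attentions=True.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer (12 for BERT-base),
# each of shape (batch, num_heads, seq_len, seq_len).
layer5_head3 = outputs.attentions[5][0, 3]
print(layer5_head3.sum(dim=-1))  # each row is a distribution over tokens, summing to 1
```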
How Multilingual is Multilingual BERT? [PDF]
In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2018) as a single language model pre-trained from monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in which task ...
Telmo Pires, Eva Schlinger, Dan Garrette
semanticscholar +1 more source
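The zero-shot cross-lingual transfer setting studied above uses one multilingual checkpoint: fine-tune it on task data in a source language, then evaluate it unchanged on another language. A minimal sketch of that setup (the checkpoint name is real; the classifier head here is untrained, so the prediction is only illustrative) could be:

```python
# Sketch: zero-shot cross-lingual transfer with Multilingual BERT.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2
)

# ... fine-tune on task data in the source language (e.g., English) here ...

# Then evaluate directly on another language, with no further training.
spanish_input = tokenizer("La película fue excelente.", return_tensors="pt")
with torch.no_grad():
    pred = model(**spanish_input).logits.argmax(dim=-1)
print(pred)
```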

