Speculative Decoding with Big Little Decoder

Open access: yes
Advances in Neural Information Processing Systems 36, 2023
The recent emergence of Large Language Models based on the Transformer architecture has enabled dramatic advancements in the field of Natural Language Processing.
Sehoon Kim   +6 more
semanticscholar   +4 more sources
Some of the following articles may not be open access.

Medical Image Segmentation via Cascaded Attention Decoding

IEEE Workshop/Winter Conference on Applications of Computer Vision, 2023
Transformers have shown great promise in medical image segmentation due to their ability to capture long-range dependencies through self-attention. However, they lack the ability to learn the local (contextual) relations among pixels.
M. Rahman, R. Marculescu
semanticscholar   +1 more source

Decoder malfunction in BCH decoders

IEEE Transactions on Information Theory, 1990
A t-error-correcting bounded-distance decoder either produces the codeword nearest the received vector (if there is a codeword at distance no more than t) or indicates that no such codeword exists. However, BCH decoders based on the Peterson-Gorenstein-Zierler algorithm or the Euclidean algorithm can malfunction and produce output vectors that are not ...
Dilip V. Sarwate, Robert D. Morrison
openaire   +1 more source

On α-decodability and α-likelihood decoder

2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2017
Generalizing the maximum and the average error criteria for channel coding, we introduce the α-decodability, defined as the α-norm of the probabilities of correctly decoding the messages. Several aspects, such as the exponent, the existence of a strong Fano's inequality, and the achievability of the channel capacity by random coding are investigated ...
Jingbo Liu, Paul Cuff, Sergio Verdú
openaire   +1 more source

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

arXiv.org
Diffusion-based large language models (Diffusion LLMs) have shown promise for non-autoregressive text generation with parallel decoding capabilities. However, the practical inference speed of open-sourced Diffusion LLMs often lags behind autoregressive ...
Chengyue Wu   +8 more
semanticscholar   +1 more source

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

International Conference on Machine Learning
Large Language Models (LLMs) employ auto-regressive decoding that requires sequential computation, with each step reliant on the previous one's output.
Tianle Cai   +6 more
semanticscholar   +1 more source

DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving

USENIX Symposium on Operating Systems Design and Implementation
DistServe improves the performance of large language models (LLMs) serving by disaggregating the prefill and decoding computation. Existing LLM serving systems colocate the two phases and batch the computation of prefill and decoding across all users and ...
Yinmin Zhong   +7 more
semanticscholar   +1 more source

Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding

Annual Meeting of the Association for Computational Linguistics
To mitigate the high inference latency stemming from autoregressive decoding in Large Language Models (LLMs), Speculative Decoding has emerged as a novel decoding paradigm for LLM inference.
Heming Xia   +8 more
semanticscholar   +1 more source

COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics

Neural Information Processing Systems, 2022
Many applications of text generation require incorporating different constraints to control the semantics or style of generated text. These constraints can be hard (e.g., ensuring certain keywords are included in the output) and soft (e.g ...
Lianhui Qin   +3 more
semanticscholar   +1 more source
