Results 341 to 350 of about 833,540 (375)
Some of the following articles may not be open access.
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
International Conference on Machine Learning
Large Language Models (LLMs) employ auto-regressive decoding that requires sequential computation, with each step reliant on the previous one's output.
Tianle Cai +6 more
semanticscholar +1 more source
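The sequential dependency named in the Medusa abstract above can be made concrete with a minimal sketch. `toy_model` is a hypothetical stand-in for an LLM forward pass, not Medusa's API; the point is only that each loop iteration consumes the previous iteration's output, so the steps cannot run in parallel.

```python
def toy_model(context):
    # Hypothetical stand-in for an LLM forward pass: returns the next token.
    return (sum(context) + 1) % 50

def autoregressive_decode(prompt, num_tokens):
    """Plain auto-regressive decoding: one token per model call."""
    tokens = list(prompt)
    for _ in range(num_tokens):
        next_token = toy_model(tokens)  # depends on every prior step
        tokens.append(next_token)
    return tokens[len(prompt):]

print(autoregressive_decode([3, 7], num_tokens=4))  # -> [11, 22, 44, 38]
```

Medusa-style multiple decoding heads attack exactly this loop by drafting several future tokens per forward pass instead of one.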
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving
USENIX Symposium on Operating Systems Design and Implementation
DistServe improves the performance of large language model (LLM) serving by disaggregating the prefill and decoding computation. Existing LLM serving systems colocate the two phases and batch the computation of prefill and decoding across all users and ...
Yinmin Zhong +7 more
semanticscholar +1 more source
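A hedged sketch of the disaggregation idea in the DistServe abstract: route the prompt-processing (prefill) phase and the token-emitting (decoding) phase to separate worker pools so batching one phase never stalls the other. The pool names and `route` function are illustrative, not DistServe's actual API.

```python
# Illustrative worker pools; in a real system these would be GPU instances.
PREFILL_POOL = ["prefill-gpu-0", "prefill-gpu-1"]
DECODE_POOL = ["decode-gpu-0"]

def route(phase, request_id):
    """Disaggregated routing: each phase has dedicated workers."""
    pool = PREFILL_POOL if phase == "prefill" else DECODE_POOL
    return pool[request_id % len(pool)]

print(route("prefill", 3))  # -> prefill-gpu-1
print(route("decode", 3))   # -> decode-gpu-0
```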
IEEE Transactions on Information Theory, 1979
Any symbol in a redundant code can be recovered when it belongs to certain erasure patterns. Several alternative expressions of a given symbol, to be referred to as its replicas, can therefore be computed in terms of other ones. Decoding is interpreted as decoding upon a received symbol, given itself and a number of such replicas, expressed in terms of ...
M. Decouvelaere +2 more
openaire +2 more sources
Annual Meeting of the Association for Computational Linguistics
To mitigate the high inference latency stemming from autoregressive decoding in Large Language Models (LLMs), Speculative Decoding has emerged as a novel decoding paradigm for LLM inference.
Heming Xia +8 more
semanticscholar +1 more source
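The accept/reject loop at the heart of the speculative decoding paradigm surveyed above can be sketched in a few lines: a cheap draft model proposes several tokens, the target model verifies them, and the longest agreeing prefix is kept. `draft` and `target_next` are toy stand-ins, not any real model API.

```python
def draft(context, k):
    # Toy draft model: agrees with the target for the first two tokens,
    # then proposes a wrong token, to exercise the rejection path.
    return [(context[-1] + i + 1) if i < 2 else -1 for i in range(k)]

def target_next(context):
    # Toy target model: the "correct" next token is always last + 1.
    return context[-1] + 1

def speculative_step(context, k=4):
    """One round of speculative decoding: draft k tokens, verify, accept a prefix."""
    proposed = draft(context, k)
    accepted = []
    for tok in proposed:
        if tok == target_next(context + accepted):
            accepted.append(tok)  # draft agreed with the target model
        else:
            break  # first disagreement: discard the rest of the draft
    if len(accepted) < k:
        accepted.append(target_next(context + accepted))  # target's own token
    return accepted

print(speculative_step([10], k=4))  # -> [11, 12, 13]
```

One round here yields three tokens for a single (conceptual) target-model verification pass, which is where the latency reduction comes from.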
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
International Conference on Machine Learning
Autoregressive decoding of large language models (LLMs) is memory bandwidth bounded, resulting in high latency and significant wastes of the parallel processing power of modern accelerators.
Yichao Fu +3 more
semanticscholar +1 more source
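A hedged sketch of the Jacobi-style fixed-point view behind lookahead decoding: guess a whole window of future tokens, refine every position from the current guess in each iteration, and stop when the window stops changing. `toy_next` is an illustrative stand-in for a model forward pass, not the paper's implementation.

```python
def toy_next(prefix):
    # Toy model: the next token is always last + 1.
    return prefix[-1] + 1

def jacobi_decode(prompt, window, max_iters=10):
    """Guess-and-refine decoding of `window` tokens at once."""
    guess = [0] * window  # arbitrary initial guesses
    for _ in range(max_iters):
        # Refine all positions from the current guess in parallel
        # (conceptually one batched forward pass, not `window` passes).
        new = [toy_next(prompt + guess[:i]) for i in range(window)]
        if new == guess:  # fixed point: every token is self-consistent
            return guess
        guess = new
    return guess

print(jacobi_decode([5], window=3))  # -> [6, 7, 8]
```

In the worst case this converges in as many iterations as sequential decoding, but whenever several positions settle in the same iteration the sequential dependency is broken.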
Probability of deficient decoding in sequential decoding [PDF]
The deficient probability is the probability that a sequential decoder fails to decode the received data. For Jelinek's (1968) sequential decoder model, we give an asymptotically tight bound.
openaire +1 more source
On α-decodability and α-likelihood decoder
2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2017
Generalizing the maximum and the average error criteria for channel coding, we introduce the α-decodability, defined as the α-norm of the probabilities of correctly decoding the messages. Several aspects, such as the exponent, the existence of a strong Fano's inequality, and the achievability of the channel capacity by random coding are investigated ...
Paul Cuff, Sergio Verdu, Jingbo Liu
openaire +1 more source
Speech decoder and a method for decoding speech
The Journal of the Acoustical Society of America, 2009
A speech decoder comprises a decoder (103) for converting a linear prediction encoded speech signal into a first sample stream having a first sampling rate and representing a first frequency band. Additionally it comprises a vocoder (105) for converting an input signal into a second sample stream having a second sampling rate and representing a second ...
Jani Rotola-Pukkila +2 more
openaire +2 more sources
A modified HEVC decoder for low power decoding
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015
The increasing prominence of video oriented services, such as video conferencing, streaming or sharing, over other Internet services has made video decoding a must-have feature for any consumer device. As a complex signal processing task, video decoding puts a high pressure on battery-driven devices since battery autonomy becomes a key performance ...
Nogues, Erwan +3 more
openaire +2 more sources
Decoding OvTDM with sphere-decoding algorithm
The Journal of China Universities of Posts and Telecommunications, 2008
Overlapped time division multiplexing (OvTDM) is a new type of transmission scheme with high spectrum efficiency and low threshold signal-to-noise ratio (SNR). In this article, the structure of OvTDM is introduced and the sphere-decoding algorithm of complex domain is proposed for OvTDM.
Dao-ben Li, Xin Jin
openaire +2 more sources
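The core idea of sphere decoding, named in the OvTDM abstract above, can be shown with a toy one-dimensional example: only candidates whose distance to the received value falls inside a (shrinking) sphere radius are considered, and the closest survivor wins. Real OvTDM sphere decoding searches complex-valued trellis paths; this sketch only illustrates the radius-pruning principle.

```python
def sphere_decode(y, h, candidates, radius):
    """Toy sphere decoder for y = h*x + noise over integer symbols."""
    best, best_d = None, radius
    for x in candidates:
        d = abs(y - h * x)
        if d <= best_d:      # inside the current sphere: keep and shrink it
            best, best_d = x, d
    return best

print(sphere_decode(2.7, 1.0, range(-4, 5), radius=1.0))  # -> 3
```

Shrinking the radius after each accepted candidate is what lets practical sphere decoders prune most of the search space.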