A Statistical Extension of Byte-Pair Encoding [PDF]
Sub-word segmentation is currently a standard tool for training neural machine translation (MT) systems and for other NLP tasks. The goal is to split words (in both the source and target languages) into smaller units, which then constitute the input and output vocabularies of the MT system.
David Vilar, Marcello Federico
openaire +1 more source
A Formal Perspective on Byte-Pair Encoding
Byte-Pair Encoding (BPE) is a popular algorithm used for tokenizing data in NLP, despite being devised initially as a compression method. BPE appears to be a greedy algorithm at face value, but the underlying optimization problem that BPE seeks to solve has not yet been laid down. We formalize BPE as a combinatorial optimization problem. Via submodular
Vilém Zouhar +6 more
openaire +4 more sources
Morpheme Embedding for Bahasa Indonesia Using Modified Byte Pair Encoding
Word embedding is an efficient feature representation that carries semantic and syntactic information. It operates at the word level, treating words as the smallest independent units, and so cannot handle words that are not in the training corpus ...
Amalia Amalia +3 more
doaj +1 more source
Byte Pair Encoding for Symbolic Music
When used with deep learning, the symbolic music modality is often coupled with language model architectures. To do so, the music needs to be tokenized, i.e. converted into a sequence of discrete tokens. This can be achieved by different approaches, as music can be composed of simultaneous tracks, of simultaneous notes with several attributes.
Nathan Fradet +4 more
openaire +2 more sources
Uncovering the neural mechanisms underlying learning from tests. [PDF]
People learn better when re-study opportunities are replaced with tests. While researchers have begun to speculate on why testing is superior to study, few studies have directly examined the neural underpinnings of this effect.
Xiaonan L Liu +3 more
doaj +1 more source
A pair of non-Mendelian genes at the Ga2 locus confer unilateral cross-incompatibility in maize
Unilateral cross-incompatibility (UCI) systems are regulated by a male-female gene pair that is genetically linked, but no pair of male and female determinants has been isolated so far.
Zhibin Chen +7 more
doaj +1 more source
Making judgments of learning (JOLs) while studying related word pairs can enhance performance on tests that rely on cue-target associations (e.g., cued recall) compared to studying alone.
Michelle L. Rivers +4 more
doaj +1 more source
Essential Bacterial Functions Encoded by Gene Pairs [PDF]
ABSTRACT To address the need for new antibacterials, a number of bacterial genomes have been systematically disrupted to identify essential genes. Such programs have focused on the disruption of single genes and may have missed functions encoded by gene pairs or multiple genes.
Thomaides HB +7 more
openaire +3 more sources
Immittance spectral pairs (ISP) for speech encoding [PDF]
Immittance spectral pairs (ISPs) form a new set of parameters for representing the linear predictive coding (LPC) filter. For a filter of order n, the ISP representation consists of a gain and n-1 frequency parameters, instead of the n frequency parameters used by line spectrum pairs (LSPs).
Yuval Bistritz, Shlomo Peller
openaire +2 more sources
Codon Pair Bias Is a Direct Consequence of Dinucleotide Bias
Codon pair bias is a remarkably stable characteristic of a species. Although it is functionally uncharacterized, robust virus attenuation has been achieved by recoding viral proteins using underrepresented codon pairs.
Dusan Kunec, Nikolaus Osterrieder
doaj +1 more source

