Linguistic steganography in Spanish based on the N-gram model and Zipf's law (Esteganografía lingüística en lengua española basada en modelo N-gram y ley de Zipf)

open access: yes, Arbor: Ciencia, Pensamiento y Cultura, 2014
Linguistic steganography is a science that draws on computational linguistics to design systems useful for the protection and privacy of digital communications and for the digital watermarking of texts.
Alfonso Muñoz Muñoz   +1 more
doaj   +1 more source

Recursive n-gram hashing is pairwise independent, at best [PDF]

open access: yes, 2010
Many applications use sequences of n consecutive symbols (n-grams). Hashing these n-grams can be a performance bottleneck. For more speed, recursive hash families compute hash values by updating previous values.
Carter   +12 more
core   +2 more sources

LANGUAGE ADJUSTMENT ON A CONTACT SITUATIONS BETWEEN NATIVE SPEAKERS AND ADVANCED JAPANESE LEARNERS: A COMPARISON WITH NATIVE SPEAKERS SITUATIONS (母語話者と上級日本語学習者の接続場面における言語調整:母語場面との比較)

open access: yes, Jurnal Japanedu: Pendidikan dan Pengajaran Bahasa Jepang, 2018
Situations in which both native and non-native speakers participate in a conversation are called contact situations. In these situations, both native and non-native speakers adjust their verbal behaviour to achieve a smooth conversation. Adjustments
Dwiky Yoseph Christopher
doaj   +1 more source

Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features [PDF]

open access: yes, 2017
The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question if similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well.
Gupta, Prakhar   +2 more
core   +3 more sources

Language Modeling with Power Low Rank Ensembles [PDF]

open access: yes, 2014
We present power low rank ensembles (PLRE), a flexible framework for n-gram language modeling where ensembles of low rank matrices and tensors are used to obtain smoothed probability estimates of words in context.
Dyer, Chris   +3 more
core   +2 more sources

Information quantity for secondary structure propensities of protein subsequences in the Protein Data Bank

open access: yes, Biophysics and Physicobiology, 2022
Elucidating the principles of sequence–structure relationships of proteins is a long-standing issue in biology. The nature of a short segment of a protein is determined by both the subsequence of the segment itself and its environment.
Ryohei Kondo   +2 more
doaj   +1 more source

Implicit n-grams Induced by Recurrence

open access: yes, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
Although self-attention based models such as Transformers have achieved remarkable successes on natural language processing (NLP) tasks, recent studies reveal that they have limitations on modeling sequential transformations (Hahn, 2020), which may prompt re-examinations of recurrent neural networks (RNNs) that demonstrated impressive results on ...
Sun, Xiaobing, Lu, Wei
openaire   +2 more sources

N-gram Boosting: Improving Contextual Biasing with Normalized N-gram Targets

open access: yes, 2023
Accurate transcription of proper names and technical terms is particularly important in speech-to-text applications for business conversations. These words, which are essential to understanding the conversation, are often rare and therefore likely to be under-represented in text and audio training data, creating a significant challenge in this domain ...
Li, Wang Yau   +7 more
openaire   +2 more sources

Auto-Sizing Neural Networks: With Applications to n-gram Language Models [PDF]

open access: yes, 2015
Neural networks have been shown to improve performance across a range of natural-language tasks. However, designing and training them can be complicated. Frequently, researchers resort to repeated experimentation to pick optimal settings.
Chiang, David, Murray, Kenton
core   +1 more source

Features of Distributional Method for Indonesian Word Clustering

open access: yes, JEPIN (Jurnal Edukasi dan Penelitian Informatika), 2019
We described the results of a study to determine the best features for algorithm EWSB (Extended Word Similarity Based). EWSB is a word clustering algorithm that can be used for all languages with a common feature.
Herry Sujaini
doaj   +1 more source
