
Esteganografía lingüística en lengua española basada en modelo N-gram y ley de Zipf (Linguistic steganography in Spanish based on the N-gram model and Zipf's law)

open access: yes, Arbor: Ciencia, Pensamiento y Cultura, 2014
Linguistic steganography is a discipline that draws on computational linguistics to design systems useful for the protection and privacy of digital communications and for the digital watermarking of texts.
Alfonso Muñoz Muñoz   +1 more
doaj   +1 more source

LANGUAGE ADJUSTMENT ON A CONTACT SITUATIONS BETWEEN NATIVE SPEAKERS AND ADVANCED JAPANESE LEARNERS: A COMPARISON WITH NATIVE SPEAKERS SITUATIONS (母語話者と上級日本語学習者の接続場面における言語調整:母語場面との比較)

open access: yes, Jurnal Japanedu: Pendidikan dan Pengajaran Bahasa Jepang, 2018
The situations where both native and non-native speakers participate in a conversation are called contact situations. In these situations, both native and non-native speakers make verbal behaviour adjustments to achieve a smooth conversation. Adjustments
Dwiky Yoseph Christopher
doaj   +1 more source

Precise n-gram Probabilities from Stochastic Context-free Grammars [PDF]

open access: yes, 1994
We present an algorithm for computing n-gram probabilities from stochastic context-free grammars, a procedure that can alleviate some of the standard problems associated with n-grams (estimation from sparse data, lack of linguistic structure, among ...
Segal, Jonathan, Stolcke, Andreas
core   +4 more sources

Molding CNNs for text: non-linear, non-consecutive convolutions [PDF]

open access: yes, 2015
The success of deep learning often derives from well-chosen operational building blocks. In this work, we revise the temporal convolution operation in CNNs to better adapt it to text processing. Instead of concatenating word representations, we appeal to
Barzilay, Regina   +2 more
core   +1 more source

Implicit n-grams Induced by Recurrence

open access: yes, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
Although self-attention based models such as Transformers have achieved remarkable successes on natural language processing (NLP) tasks, recent studies reveal that they have limitations on modeling sequential transformations (Hahn, 2020), which may prompt re-examinations of recurrent neural networks (RNNs) that demonstrated impressive results on ...
Sun, Xiaobing, Lu, Wei
openaire   +2 more sources

Information quantity for secondary structure propensities of protein subsequences in the Protein Data Bank

open access: yes, Biophysics and Physicobiology, 2022
Elucidating the principles of sequence–structure relationships of proteins is a long-standing issue in biology. The nature of a short segment of a protein is determined by both the subsequence of the segment itself and its environment.
Ryohei Kondo   +2 more
doaj   +1 more source

Bug or Not? Bug Report Classification Using N-Gram IDF [PDF]

open access: yes, 2017
Previous studies have found that a significant number of bug reports are misclassified between bugs and non-bugs, and that manually classifying bug reports is a time-consuming task.
Hata, Hideaki   +3 more
core   +2 more sources

N-gram Boosting: Improving Contextual Biasing with Normalized N-gram Targets

open access: yes, 2023
Accurate transcription of proper names and technical terms is particularly important in speech-to-text applications for business conversations. These words, which are essential to understanding the conversation, are often rare and therefore likely to be under-represented in text and audio training data, creating a significant challenge in this domain ...
Li, Wang Yau   +7 more
openaire   +2 more sources

Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features [PDF]

open access: yes, 2017
The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question if similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well.
Gupta, Prakhar   +2 more
core   +3 more sources

Features of Distributional Method for Indonesian Word Clustering

open access: yes, JEPIN (Jurnal Edukasi dan Penelitian Informatika), 2019
We described the results of a study to determine the best features for algorithm EWSB (Extended Word Similarity Based). EWSB is a word clustering algorithm that can be used for all languages with a common feature.
Herry Sujaini
doaj   +1 more source
