Esteganografía lingüística en lengua española basada en modelo N-gram y ley de Zipf [Linguistic steganography in Spanish based on an N-gram model and Zipf's law]
Linguistic steganography is a science that draws on computational linguistics to design systems useful for the protection and privacy of digital communications and for the digital watermarking of texts.
Alfonso Muñoz Muñoz +1 more
doaj +1 more source
The situations where both native and non-native speakers participate in a conversation are called contact situations. In these situations, both native and non-native speakers make verbal behaviour adjustments to achieve a smooth conversation. Adjustments
Dwiky Yoseph Christopher
doaj +1 more source
Precise n-gram Probabilities from Stochastic Context-free Grammars [PDF]
We present an algorithm for computing n-gram probabilities from stochastic context-free grammars, a procedure that can alleviate some of the standard problems associated with n-grams (estimation from sparse data, lack of linguistic structure, among ...
Segal, Jonathan, Stolcke, Andreas
core +4 more sources
Molding CNNs for text: non-linear, non-consecutive convolutions [PDF]
The success of deep learning often derives from well-chosen operational building blocks. In this work, we revise the temporal convolution operation in CNNs to better adapt it to text processing. Instead of concatenating word representations, we appeal to
Barzilay, Regina +2 more
core +1 more source
Implicit n-grams Induced by Recurrence
Although self-attention based models such as Transformers have achieved remarkable successes on natural language processing (NLP) tasks, recent studies reveal that they have limitations on modeling sequential transformations (Hahn, 2020), which may prompt re-examinations of recurrent neural networks (RNNs) that demonstrated impressive results on ...
Sun, Xiaobing, Lu, Wei
openaire +2 more sources
Elucidating the principles of sequence–structure relationships of proteins is a long-standing issue in biology. The nature of a short segment of a protein is determined by both the subsequence of the segment itself and its environment.
Ryohei Kondo +2 more
doaj +1 more source
Bug or Not? Bug Report Classification Using N-Gram IDF [PDF]
Previous studies have found that a significant number of bug reports are misclassified between bugs and non-bugs, and that manually classifying bug reports is a time-consuming task.
Hata, Hideaki +3 more
core +2 more sources
N-gram Boosting: Improving Contextual Biasing with Normalized N-gram Targets
Accurate transcription of proper names and technical terms is particularly important in speech-to-text applications for business conversations. These words, which are essential to understanding the conversation, are often rare and therefore likely to be under-represented in text and audio training data, creating a significant challenge in this domain ...
Li, Wang Yau +7 more
openaire +2 more sources
Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features [PDF]
The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question if similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well.
Gupta, Prakhar +2 more
core +3 more sources
Features of Distributional Method for Indonesian Word Clustering
We described the results of a study to determine the best features for algorithm EWSB (Extended Word Similarity Based). EWSB is a word clustering algorithm that can be used for all languages with a common feature.
Herry Sujaini
doaj +1 more source

