
Comparing neural- and N-gram-based language models for word segmentation. [PDF]

open access: yes, J Assoc Inf Sci Technol, 2019
Word segmentation is the task of inserting or deleting word boundary characters in order to separate character sequences that correspond to words in some language.
Doval Y, Gómez-Rodríguez C.
europepmc   +4 more sources

Variable word rate N-grams [PDF]

open access: yes, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), 2002
4 pages, 4 figures, ICASSP ...
Gotoh, Yoshihiko, Renals, Steve
openaire   +4 more sources

Stemmer and phonotactic rules to improve n-gram tagger-based Indonesian phonemicization

open access: yes, Journal of King Saud University: Computer and Information Sciences, 2022
Phonemicization, or grapheme-to-phoneme conversion (G2P), is the process of converting a word into its pronunciation. It is one of the essential components in speech synthesis, speech recognition, and natural language processing.
Suyanto Suyanto   +4 more
doaj   +1 more source

Small world of the miRNA science drives its publication dynamics

open access: yes, Вавиловский журнал генетики и селекции, 2023
Many scientific articles have become available in digital form, which allows querying article data and, in particular, automated gathering of metadata, including affiliation data.
A. B. Firsov, I. I. Titov
doaj   +1 more source

Computing n-gram statistics in MapReduce [PDF]

open access: yes, Proceedings of the 16th International Conference on Extending Database Technology, 2013
Statistics about n-grams (i.e., sequences of contiguous words or other tokens in text documents or other string data) are an important building block in information retrieval and natural language processing. In this work, we study how n-gram statistics, optionally restricted by a maximum n-gram length and minimum collection frequency, can be computed ...
Berberich, K., Bedathur, S.
openaire   +4 more sources
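
To illustrate the kind of statistic this abstract describes, here is a minimal in-memory sketch that counts word n-grams up to a maximum length and keeps only those meeting a minimum collection frequency. It is not the paper's MapReduce method; the function name `ngram_counts`, its parameters, and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(docs, max_len=3, min_freq=2):
    """Count word n-grams of length 1..max_len over all docs (hypothetical helper).

    Only n-grams whose total count across the collection is at least
    min_freq are returned. Tokenization is plain whitespace splitting,
    which is an assumption of this sketch.
    """
    counts = Counter()
    for doc in docs:
        tokens = doc.split()
        for n in range(1, max_len + 1):
            # Slide a window of n contiguous tokens over the document.
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_freq}


docs = ["a b a b", "b a b"]
stats = ngram_counts(docs, max_len=2, min_freq=2)
```

Raising `min_freq` prunes rare n-grams, and lowering `max_len` bounds the number of distinct keys, which is the trade-off the two restrictions in the abstract control.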

Handling Massive N-Gram Datasets Efficiently [PDF]

open access: yes, 2018
This paper deals with the two fundamental problems concerning the handling of large n-gram language models: indexing, that is compressing the n-gram strings and associated satellite data without compromising their retrieval speed; and estimation, that is ...
Pibiri, Giulio Ermanno   +1 more
core   +3 more sources

Fast Structural Alignment of Biomolecules Using a Hash Table, N-Grams and String Descriptors

open access: yes, Algorithms, 2009
This work presents a generalized approach for the fast structural alignment of thousands of macromolecular structures. The method uses string representations of a macromolecular structure and a hash table that stores n-grams of a certain size for ...
Robert Preissner   +6 more
doaj   +1 more source

Human assessments of document similarity [PDF]

open access: yes, 2010
Two studies are reported that examined the reliability of human assessments of document similarity and the association between human ratings and the results of n-gram automatic text analysis (ATA).
Belkin   +28 more
core   +1 more source

Detection of Algorithmically Generated Malicious Domain Names with Feature Fusion of Meaningful Word Segmentation and N-Gram Sequences

open access: yes, Applied Sciences, 2023
Domain generation algorithms (DGAs) play an important role in network attacks and can be mainly divided into two types: dictionary-based and character-based.
Shaojie Chen   +3 more
doaj   +1 more source

Recursive n-gram hashing is pairwise independent, at best [PDF]

open access: yes, 2010
Many applications use sequences of n consecutive symbols (n-grams). Hashing these n-grams can be a performance bottleneck. For more speed, recursive hash families compute hash values by updating previous values.
Carter   +12 more
core   +2 more sources
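
The idea of computing each hash by updating the previous one can be sketched with a generic Karp-Rabin-style polynomial rolling hash over character n-grams. This is an illustrative example of recursive hashing in general, not necessarily one of the hash families analyzed in the paper; the names `rolling_ngram_hashes` and `direct_hash`, and the choice of `base` and `mod`, are assumptions.

```python
def direct_hash(s, base=257, mod=(1 << 61) - 1):
    """Polynomial hash of s computed from scratch (hypothetical reference helper)."""
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    return h


def rolling_ngram_hashes(text, n, base=257, mod=(1 << 61) - 1):
    """Hash every character n-gram of text, updating each hash from the previous one."""
    if n <= 0 or len(text) < n:
        return []
    # Weight carried by the symbol that leaves the window on each step.
    high = pow(base, n - 1, mod)
    h = direct_hash(text[:n], base, mod)
    hashes = [h]
    for i in range(n, len(text)):
        # Remove the outgoing symbol, shift, then append the incoming symbol.
        h = (h - ord(text[i - n]) * high) % mod
        h = (h * base + ord(text[i])) % mod
        hashes.append(h)
    return hashes


hashes = rolling_ngram_hashes("abracadabra", 3)
```

Each update costs a constant number of arithmetic operations regardless of `n`, whereas rehashing every window from scratch costs time proportional to `n` per window.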
