Accelerated preprocessing in task of searching substrings in a string
Introduction. The rapid development of systems such as Yandex and Google has made the task of searching for substrings in a string highly relevant, and approaches to solving it are being actively investigated. This task is used to create database
A. V. Mazurenko, N. V. Boldyrikhin
doaj +1 more source
These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure. [PDF]
K-mer abundance analysis is widely used for many purposes in nucleotide sequence analysis, including data preprocessing for de novo assembly, repeat detection, and sequencing coverage estimation.
Qingpeng Zhang +4 more
doaj +1 more source
On the suitability of suffix arrays for Lempel-Ziv data compression [PDF]
Lossless compression algorithms of the Lempel-Ziv (LZ) family are widely used nowadays. Regarding time and memory requirements, LZ encoding is much more demanding than decoding.
D. Gusfield +9 more
core +2 more sources
Gclust: A Parallel Clustering Tool for Microbial Genomic Data
The accelerating growth of public microbial genomic data imposes a substantial burden on the research community that uses such resources. Building databases for non-redundant reference sequences from massive microbial genomic data based on clustering ...
Ruilin Li +17 more
doaj +1 more source
Computing Maximal Lyndon Substrings of a String
There are two reasons to have an efficient algorithm for identifying all right-maximal Lyndon substrings of a string: firstly, Bannai et al. introduced in 2015 a linear algorithm to compute all runs of a string that relies on knowing all right-maximal ...
Frantisek Franek, Michael Liut
doaj +1 more source
Compressed Representations of Permutations, and Applications [PDF]
We explore various techniques to compress a permutation $\pi$ over $n$ integers, taking advantage of ordered subsequences in $\pi$, while supporting its application $\pi(i)$ and the application of its inverse $\pi^{-1}(i)$ in small time.
Barbay, Jérémy, Navarro, Gonzalo
core +7 more sources
Approximate String Matching with Compressed Indexes
A compressed full-text self-index for a text T is a data structure requiring reduced space and able to search for patterns P in T. It can also reproduce any substring of T, thus actually replacing T. Despite the recent explosion of interest in compressed
Pedro Morales +3 more
doaj +1 more source
Lyndon Array Construction during Burrows-Wheeler Inversion [PDF]
In this paper we present an algorithm to compute the Lyndon array of a string $T$ of length $n$ as a byproduct of the inversion of the Burrows-Wheeler transform of $T$.
Louza, Felipe A. +3 more
core +3 more sources
Deterministic sub-linear space LCE data structures with efficient construction [PDF]
Given a string $S$ of $n$ symbols, a longest common extension query $\mathsf{LCE}(i,j)$ asks for the length of the longest common prefix of the $i$th and $j$th suffixes of $S$. LCE queries have several important applications in string processing, perhaps
Bannai, Hideo +5 more
core +2 more sources
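The LCE query defined in the abstract above can be illustrated with a naive character-by-character scan. This is only a sketch of the definition, not the sub-linear-space data structure the paper constructs; the function name `lce` and the 0-based indexing are assumptions made for this example.

```python
def lce(s: str, i: int, j: int) -> int:
    """Naive longest common extension (illustrative only).

    Returns the length of the longest common prefix of the suffixes
    s[i:] and s[j:]. Indices are 0-based here, which is an assumption
    of this sketch. Each query costs O(n) time in the worst case.
    """
    n = len(s)
    if not (0 <= i <= n and 0 <= j <= n):
        raise IndexError("suffix index out of range")
    length = 0
    # Extend the match while both positions are in range and agree.
    while i + length < n and j + length < n and s[i + length] == s[j + length]:
        length += 1
    return length
```

For example, `lce("banana", 1, 3)` compares the suffixes "anana" and "ana" and returns 3. The scan needs no extra space beyond the string itself, at the price of linear query time in the worst case.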
EERTREE: An Efficient Data Structure for Processing Palindromes in Strings [PDF]
We propose a new linear-size data structure which provides fast access to all palindromic substrings of a string or a set of strings. This structure inherits some ideas from the construction of both the suffix trie and suffix tree. Using this structure,
Rubinchik, Mikhail, Shur, Arseny M.
core +1 more source

