
Fast mapping of short sequences with mismatches, insertions and deletions using index structures.

open access: yes, PLoS Computational Biology, 2009
With few exceptions, current methods for short read mapping make use of simple seed heuristics to speed up the search. Most of the underlying matching models neglect the necessity to allow not only mismatches, but also insertions and deletions.
Steve Hoffmann   +7 more
doaj   +1 more source

Scalable Parallel Suffix Array Construction

open access: yes, Parallel Computing, 2006
Suffix arrays are a simple and powerful data structure for text processing that can be used for full text indexes, data compression, and many other applications in particular in bioinformatics. We describe the first implementation and experimental evaluation of a scalable parallel algorithm for suffix array construction.
Kulla, F., Sanders, P.
openaire   +3 more sources

Accelerated preprocessing in task of searching substrings in a string

open access: yes, Advanced Engineering Research, 2019
Introduction. A rapid development of the systems such as Yandex, Google, etc., has predetermined the relevance of the task of searching substrings in a string, and approaches to its solution are actively investigated. This task is used to create database
A. V. Mazurenko, N. V. Boldyrikhin
doaj   +1 more source

These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure.

open access: yes, PLoS ONE, 2014
K-mer abundance analysis is widely used for many purposes in nucleotide sequence analysis, including data preprocessing for de novo assembly, repeat detection, and sequencing coverage estimation.
Qingpeng Zhang   +4 more
doaj   +1 more source

Gclust: A Parallel Clustering Tool for Microbial Genomic Data

open access: yes, Genomics, Proteomics & Bioinformatics, 2019
The accelerating growth of the public microbial genomic data imposes substantial burden on the research community that uses such resources. Building databases for non-redundant reference sequences from massive microbial genomic data based on clustering ...
Ruilin Li   +17 more
doaj   +1 more source

Computing Maximal Lyndon Substrings of a String

open access: yes, Algorithms, 2020
There are two reasons to have an efficient algorithm for identifying all right-maximal Lyndon substrings of a string: firstly, Bannai et al. introduced in 2015 a linear algorithm to compute all runs of a string that relies on knowing all right-maximal ...
Frantisek Franek, Michael Liut
doaj   +1 more source

Deterministic sub-linear space LCE data structures with efficient construction

open access: yes, 2016
Given a string $S$ of $n$ symbols, a longest common extension query $\mathsf{LCE}(i,j)$ asks for the length of the longest common prefix of the $i$th and $j$th suffixes of $S$. LCE queries have several important applications in string processing, perhaps
Bannai, Hideo   +5 more
core   +2 more sources
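
To make the query in the abstract above concrete, here is a minimal sketch of $\mathsf{LCE}(i,j)$ computed by direct character comparison. It is a naive baseline that takes time proportional to the answer, not the sub-linear space data structure the paper describes. The function name `lce` and the use of 0-based indices are assumptions made for illustration.

```python
def lce(s: str, i: int, j: int) -> int:
    """Naive longest common extension.

    Returns the length of the longest common prefix of the suffixes
    s[i:] and s[j:]. Indices are 0-based here (an assumption; the
    abstract does not fix an indexing convention).
    """
    n = len(s)
    k = 0
    # Extend the match one character at a time until a mismatch
    # or the end of either suffix is reached.
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k
```

For example, in "banana" the suffixes starting at positions 1 and 3 are "anana" and "ana", so `lce("banana", 1, 3)` evaluates to 3.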

Approximate String Matching with Compressed Indexes

open access: yes, Algorithms, 2009
A compressed full-text self-index for a text T is a data structure requiring reduced space and able to search for patterns P in T. It can also reproduce any substring of T, thus actually replacing T. Despite the recent explosion of interest on compressed
Pedro Morales   +3 more
doaj   +1 more source

EERTREE: An Efficient Data Structure for Processing Palindromes in Strings

open access: yes, 2015
We propose a new linear-size data structure which provides a fast access to all palindromic substrings of a string or a set of strings. This structure inherits some ideas from the construction of both the suffix trie and suffix tree. Using this structure,
Rubinchik, Mikhail, Shur, Arseny M.
core   +1 more source

Efficient computation of absent words in genomic sequences

open access: yes, BMC Bioinformatics, 2008
Background: Analysis of sequence composition is a routine task in genome research. Organisms are characterized by their base composition, dinucleotide relative abundance, codon usage, and so on.
Herold Julia   +2 more
doaj   +1 more source
