Suffix arrays - Open Access .click


Accurate long read mapping using enhanced suffix arrays [PDF]

2010
With the rise of high throughput sequencing, new programs have been developed for dealing with the alignment of a huge amount of short read data to reference genomes.
Dawyndt, Peter +4 more
core +1 more source

Universal Compressed Text Indexing [PDF]

2018
The rise of repetitive datasets has lately generated a lot of interest in compressed self-indexes based on dictionary compression, a rich and heterogeneous family that exploits text repetitions in different ways. For each such compression scheme, several
Navarro, Gonzalo, Prezza, Nicola
core +2 more sources

Distributed enhanced suffix arrays [PDF]

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019
Suffix arrays and trees are important and fundamental string data structures which lie at the foundation of many string algorithms, with important applications in computational biology, text processing, and information retrieval. Recent work enables the efficient parallel construction of suffix arrays and trees requiring at most O(n/p) memory per ...
Patrick Flick, Srinivas Aluru
openaire +1 more source

Geoseq: a tool for dissecting deep-sequencing datasets

BMC Bioinformatics, 2010
Background: Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), the Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ).
Homann Robert +6 more
doaj +1 more source

Speeding up index construction with GPU for DNA data sequences [PDF]

2011
Advances in technology in the scientific community have produced terabytes of biological data, including DNA sequences. The string matching algorithms traditionally used to match DNA sequences now take much longer to execute because ...
Abdul Rashid, Nur’aini, Rahmaddiansyah +1 more
core

On Maximal Unbordered Factors [PDF]

2015
Given a string $S$ of length $n$, its maximal unbordered factor is the longest factor which does not have a border. In this work we investigate the relationship between $n$ and the length of the maximal unbordered factor of $S$.
A Ehrenfeucht +11 more
core +5 more sources

String Comparison in $V$-Order: New Lexicographic Properties & On-line Applications [PDF]

2015
$V$-order is a global order on strings related to Unique Maximal Factorization Families (UMFFs), which are themselves generalizations of Lyndon words. $V$-order has recently been proposed as an alternative to lexicographical order in the computation of ...
Alatabbi, Ali +3 more
core +1 more source

Handling Massive N-Gram Datasets Efficiently [PDF]

2018
This paper deals with the two fundamental problems concerning the handling of large n-gram language models: indexing, that is compressing the n-gram strings and associated satellite data without compromising their retrieval speed; and estimation, that is
Pibiri, Giulio Ermanno, Venturini, Rossano +1 more
core +3 more sources

A new method to compute K-mer frequencies and its application to annotate large repetitive plant genomes

BMC Genomics, 2008
Background: The challenges of accurate gene prediction and enumeration are further aggravated in large genomes that contain highly repetitive transposable elements (TEs).
Narechania Apurva +3 more
doaj +1 more source

Finding patterns in strings using suffix arrays [PDF]

2010
Finding regularities in large data sets requires implementations of systems that are efficient in both time and space requirements. Here, we describe a newly developed system that exploits the internal structure of the enhanced suffix array to find ...
Stehouwer, H., Van Zaanen, M.
core

suffix array
data structures
suffix tree

pattern matching
string algorithms
004

theoretical computer science
medical informatics
computer applications to medicine