SaAlign: Multiple DNA/RNA sequence alignment and phylogenetic tree construction tool for ultra-large datasets and ultra-long sequences based on suffix array [PDF]
Multiple DNA/RNA sequence alignment is an important fundamental tool in bioinformatics, especially for phylogenetic tree construction. With DNA-sequencing improvements, the amount of bioinformatics data is constantly increasing, and various tools need to
Ziyuan Wang +7 more
doaj +2 more sources
GHOSTX: an improved sequence homology search algorithm using a query suffix array and a database suffix array. [PDF]
DNA sequences are translated into protein coding sequences and then further assigned to protein families in metagenomic analyses, because of the need for sensitivity.
Shuji Suzuki +3 more
doaj +2 more sources
Generalized enhanced suffix array construction in external memory [PDF]
Background: Suffix arrays, augmented by additional data structures, allow many string processing problems to be solved efficiently. The external memory construction of the generalized suffix array for a string collection is a fundamental task when the size of ...
Felipe A. Louza +3 more
doaj +2 more sources
TOPAZ: asymmetric suffix array neighbourhood search for massive protein databases [PDF]
Background: Protein homology search is an important, yet time-consuming, step in everything from protein annotation to metagenomics. Its application, however, has become increasingly challenging due to the exponential growth of protein databases.
Alan Medlar, Liisa Holm
doaj +2 more sources
Compressed Spaced Suffix Arrays [PDF]
Spaced seeds are important tools for similarity search in bioinformatics, and using several seeds together often significantly improves their performance.
Gagie, Travis +2 more
core +8 more sources
Fast, parallel, and cache-friendly suffix array construction [PDF]
Purpose: String indexes such as the suffix array (SA) and the closely related longest common prefix (LCP) array are fundamental objects in bioinformatics and have a wide variety of applications.
Jamshed Khan +4 more
doaj +2 more sources
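For readers unfamiliar with the two indexes named in the entry above, the following is a minimal Python sketch of a suffix array and its LCP array. It is not the method of the listed paper: the construction is a naive sort of suffixes, the LCP array is computed with Kasai's algorithm, and the function names `build_suffix_array` and `build_lcp_array` are hypothetical, chosen only for illustration.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction (sorts full suffixes), suitable only for short inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_array(text, sa):
    """Return lcp, where lcp[r] is the length of the longest common prefix of
    the suffixes starting at sa[r - 1] and sa[r], and lcp[0] is 0.

    Uses Kasai's algorithm, which runs in linear time given the suffix array.
    """
    n = len(text)
    # rank[i] is the position of suffix i within the suffix array.
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r

    lcp = [0] * n
    h = 0  # number of matched characters carried over from the previous suffix
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]  # suffix that precedes suffix i in sorted order
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1  # the next suffix shares at least h - 1 characters
    return lcp


sa = build_suffix_array("banana")
lcp = build_lcp_array("banana", sa)
print(sa)
print(lcp)
```

The naive sort costs far more than the specialized construction algorithms these entries describe; it is used here only because it makes the definition of the suffix array explicit.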
Direct construction of sparse suffix arrays with Libsais [PDF]
Background: Pattern matching is a fundamental challenge in bioinformatics, especially in the fields of genomics, transcriptomics and proteomics. Efficient indexing structures, such as suffix arrays, are critical for searching large datasets.
Simon Van de Vyver +4 more
doaj +2 more sources
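As an illustration of why suffix arrays help with pattern matching, the sketch below finds every occurrence of a pattern by binary search over a full (not sparse) suffix array. It assumes a naively built index and does not use Libsais or any API from the listed paper; `build_suffix_array` and `find_occurrences` are hypothetical names introduced for this example.

```python
def build_suffix_array(text):
    """Naive suffix array: suffix start positions sorted lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions at which pattern occurs in text.

    All suffixes that begin with pattern form one contiguous block of the
    suffix array, so two binary searches locate the block boundaries.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[first:lo])


text = "banana"
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "ana"))
```

Each search performs O(log n) comparisons of at most m characters, so a query does not scan the text once the index exists.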
RNA-Seq mapping and detection of gene fusions with a suffix array algorithm. [PDF]
High-throughput RNA sequencing enables quantification of transcripts (both known and novel), exon/exon junctions and fusions of exons from different genes.
Onur Sakarya +23 more
doaj +2 more sources
mkESA: enhanced suffix array construction tool. [PDF]
Summary: We introduce the tool mkESA, an open source program for constructing enhanced suffix arrays (ESAs), striving for low memory consumption, yet high practical speed. mkESA is a user-friendly program written in portable C99, based on a parallelized version of the Deep-Shallow suffix array construction algorithm, which is ...
Homann R +3 more
europepmc +4 more sources
Suffix sorting via matching statistics [PDF]
We introduce a new algorithm for constructing the generalized suffix array of a collection of highly similar strings. As a first step, we construct a compressed representation of the matching statistics of the collection with respect to a reference ...
Zsuzsanna Lipták +2 more
doaj +2 more sources

