Speeding up tandem mass spectrometry-based database searching by longest common prefix
Background: Tandem mass spectrometry-based database searching has become an important technology for peptide and protein identification. One of the key challenges in database searching is the remarkable increase in computational demand, brought about by ...
Wang Le-Heng +7 more
Sampling the Suffix Array with Minimizers
Evenly sampling the suffixes of the suffix array is an old idea that trades pattern search time for reduced index space. A few years ago, Claude et al. presented an alphabet sampling scheme that allows more efficient pattern searches than the sparse suffix array for sufficiently long patterns.
Grabowski, Szymon, Raniszewski, Marcin
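The space/time trade-off of even sampling can be illustrated with a minimal sketch. It assumes the classic sparse suffix array layout, in which only suffixes starting at text positions that are multiples of k are indexed; it is not the minimizer-based scheme of the paper. The names `build_sparse_sa` and `find_all` are hypothetical, and the naive sort is for illustration only. The search is only guaranteed complete for patterns of length at least k, since shorter occurrences may not cover any sampled position.

```python
def build_sparse_sa(text, k):
    """Sorted start positions of the suffixes that begin at multiples of k."""
    # Naive construction (sorting by full suffix); fine for a small demo.
    return sorted(range(0, len(text), k), key=lambda i: text[i:])


def _bound(text, ssa, tail, upper):
    """Binary search over the sampled suffixes, compared on their first len(tail) chars."""
    lo, hi = 0, len(ssa)
    n = len(tail)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = text[ssa[mid]:ssa[mid] + n]
        if prefix < tail or (upper and prefix == tail):
            lo = mid + 1
        else:
            hi = mid
    return lo


def find_all(text, ssa, k, pattern):
    """Start positions of pattern in text (complete when len(pattern) >= k)."""
    hits = set()
    # An occurrence starting at s first touches a sampled position at s + j, j < k.
    for j in range(min(k, len(pattern))):
        tail = pattern[j:]
        left = _bound(text, ssa, tail, upper=False)
        right = _bound(text, ssa, tail, upper=True)
        for pos in ssa[left:right]:
            start = pos - j
            # Verify the unindexed first j characters directly in the text.
            if start >= 0 and text[start:pos] == pattern[:j]:
                hits.add(start)
    return sorted(hits)


text = "abracadabra"
ssa = build_sparse_sa(text, 3)
print(ssa)
print(find_all(text, ssa, 3, "abra"))
```

The index stores roughly n/k positions instead of n, at the cost of up to k binary searches plus a verification step per candidate.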
Kohdista: an efficient method to index and query possible Rmap alignments
Background: Genome-wide optical maps are ordered high-resolution restriction maps that give the position of occurrence of restriction cut sites corresponding to one or more restriction enzymes. These genome-wide optical maps are assembled using an overlap-
Martin D. Muggli +2 more
StreamAligner: a streaming based sequence aligner on Apache Spark
Next-generation sequencing technologies are generating a huge amount of genetic data that needs to be mapped and analyzed. Single-machine sequence alignment tools are increasingly unable to keep up with this volume of data efficiently.
Sanjay Rathee, Arti Kashyap
XenDB: Full length cDNA prediction and cross species mapping in
Background: Research using the model system Xenopus laevis has provided critical insights into the mechanisms of early vertebrate development and cell biology. Large-scale sequencing efforts have provided an increasingly important resource for researchers.
Giegerich Robert +4 more
Fast index based algorithms and software for matching position specific scoring matrices
Background: In biological sequence analysis, position specific scoring matrices (PSSMs) are widely used to represent sequence motifs in nucleotide as well as amino acid sequences.
Homann Robert +3 more
Finding All-Pairs Suffix-Prefix Matching Using Suffix Array
Abstract: Since string operations came into use in computational biology, security, and Internet search, various data structures and algorithms for performing them efficiently have been studied. All-pairs suffix-prefix matching finds, for every ordered pair of the given strings, the longest suffix of one string that is also a prefix of the other.
Seon-Mi Han, Jin-Woon Woo
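To make the problem statement concrete, here is a brute-force baseline that computes the same answer by direct comparison. It is not the suffix-array method the paper describes, only a reference definition of what such a method must return; the function names `overlap` and `all_pairs_suffix_prefix` are hypothetical.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def all_pairs_suffix_prefix(strings):
    """Map each ordered pair (i, j), i != j, to its suffix-prefix overlap length."""
    return {
        (i, j): overlap(a, b)
        for i, a in enumerate(strings)
        for j, b in enumerate(strings)
        if i != j
    }


reads = ["abcd", "cdef", "efab"]
for pair, length in sorted(all_pairs_suffix_prefix(reads).items()):
    print(pair, length)
```

This baseline compares every pair character by character, which is the cost that index-based approaches aim to avoid on large inputs.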
SArKS: de novo discovery of gene expression regulatory motif sites and domains by suffix array kernel smoothing.
Wylie DC, Hofmann HA, Zemelman BV.
RIsearch2: suffix array-based large-scale prediction of RNA-RNA interactions and siRNA off-targets.
Alkan F +7 more
SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences.
Pickett BD +7 more

