Efficient privacy-preserving variable-length substring match for genome sequence
The development of a privacy-preserving technology is important for accelerating genome data sharing. This study proposes an algorithm that securely searches a variable-length substring match between a query and a database sequence. Our concept hinges on
Yoshiki Nakagawa +2 more
doaj +1 more source
Engineering External Memory LCP Array Construction: Parallel, In-Place and Large Alphabet [PDF]
The suffix array augmented with the LCP array is perhaps the most important data structure in modern string processing. There has been a lot of recent research activity on constructing these arrays in external memory.
Kempa, Dominik
core +1 more source
Suppose we have a large dictionary of strings. Each entry starts with a figure of merit (popularity). We wish to find the k-best matches for a substring, s, in a dictionary, dict. That is, grep s dict | sort -n | head -k, but we would like to do this in sublinear time.
Kenneth Church +2 more
openaire +1 more source
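For orientation, the pipeline quoted in this abstract can be sketched as a linear-scan baseline in Python. This is not the authors' sublinear method, only the naive behaviour they aim to improve on. The function name k_best_matches and the (merit, text) pair layout are assumptions made for illustration. The sketch follows sort -n literally, so smaller numbers come first, and it matches s against the text only rather than the whole line as grep would.

```python
import heapq


def k_best_matches(s, entries, k):
    """Linear-time baseline for: grep s dict | sort -n | head -k.

    entries is an iterable of (merit, text) pairs (a hypothetical layout).
    Returns up to k matching pairs in ascending order of merit.
    """
    # "grep s": keep only entries whose text contains the substring s.
    matches = (entry for entry in entries if s in entry[1])
    # "sort -n | head -k": the k smallest merits, without sorting everything.
    return heapq.nsmallest(k, matches, key=lambda entry: entry[0])


if __name__ == "__main__":
    sample = [(3, "banana"), (1, "bandana"), (2, "cabana"), (5, "apple")]
    print(k_best_matches("ana", sample, 2))
```

Every query here scans the whole dictionary, which is the cost the abstract wants to avoid.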
KVLMM: A Trajectory Prediction Method Based on a Variable-Order Markov Model With Kernel Smoothing
With the dramatic proliferation of global positioning system (GPS) devices, a rich range of research has been conducted on the analysis of GPS trajectories.
Xing Wang +3 more
doaj +1 more source
New Algorithms for Position Heaps
We present several results about position heaps, a relatively new alternative to suffix trees and suffix arrays. First, we show that, if we limit the maximum length of patterns to be sought, then we can also limit the height of the heap and reduce the ...
A. Ehrenfeucht +7 more
core +1 more source
On the combinatorics of suffix arrays [PDF]
We prove several combinatorial properties of suffix arrays, including a characterization of suffix arrays through a bijection with a certain well-defined class of permutations. Our approach is based on the characterization of Burrows-Wheeler arrays given
Kucherov, Gregory +2 more
core +5 more sources
RAPSearch: a fast protein similarity search tool for short reads
Background: Next Generation Sequencing (NGS) is producing enormous corpuses of short DNA reads, affecting emerging fields like metagenomics. Protein similarity search--a key step to achieve annotation of protein-coding genes in these short reads, and ...
Choi Jeong-Hyeon, Ye Yuzhen, Tang Haixu
doaj +1 more source
Development of Fingerprint Identification Based on Device Flow in Industrial Control System
With the rapid development of industrial automation technology, a large number of industrial control devices have emerged in cyberspace, but the security of open cyberspace is difficult to guarantee.
Jun Tao +3 more
doaj +1 more source
Representing the Suffix Tree with the CDAWG [PDF]
Given a string T, it is known that its suffix tree can be represented using the compact directed acyclic word graph (CDAWG) with e_T arcs, taking overall O(e_T+e_REV(T)) words of space, where REV(T) is the reverse of T, and supporting some key operations
Belazzougui, Djamal, Cunial, Fabio
core +2 more sources
Low Space External Memory Construction of the Succinct Permuted Longest Common Prefix Array
The longest common prefix (LCP) array is a versatile auxiliary data structure in indexed string matching. It can be used to speed up searching using the suffix array (SA) and provides an implicit representation of the topology of an underlying suffix ...
D Okanohara +20 more
core +1 more source
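As a reminder of what the LCP array in this abstract holds, here is a minimal in-memory sketch in Python. It builds a plain suffix array by naive sorting and the ordinary LCP array with Kasai's algorithm. It is not the external-memory or succinct permuted construction the paper describes, and the function names are illustrative assumptions.

```python
def suffix_array(text):
    """Naive suffix array: start positions sorted by the suffix they begin.

    Sorting whole suffixes is simple but slow; fine for small examples only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """Kasai's algorithm: lcp[r] is the length of the longest common prefix
    of the suffixes at sa[r - 1] and sa[r]; lcp[0] is 0 by convention."""
    n = len(text)
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r
    lcp = [0] * n
    h = 0  # current match length, reused between consecutive text positions
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]  # suffix preceding suffix i in sorted order
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1  # dropping the first character loses at most one match
    return lcp


if __name__ == "__main__":
    t = "banana"
    sa = suffix_array(t)
    print(sa, lcp_array(t, sa))
```

The LCP values let a binary search over the suffix array skip characters already known to match, which is the speed-up the abstract alludes to.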

