Error Correcting Codes, Perfect Hashing Circuits, and Deterministic Dynamic Dictionaries
We consider dictionaries of size n over the finite universe U = {0, 1}^w and introduce a new technique for their implementation: error correcting codes. The use of such codes makes it possible to replace the use of strong forms of hashing, such as universal hashing, with much weaker forms, such as clustering. We use our approach to construct,
Peter Bro Miltersen
Simple, compact and robust approximate string dictionary [PDF]
This paper is concerned with practical implementations of approximate string dictionaries that allow edit errors. In this problem, we have as input a dictionary $D$ of $d$ strings of total length $n$ over an alphabet of size $\sigma$.
Belazzougui, Djamal, Chegrane, Ibrahim
Distributed PCP Theorems for Hardness of Approximation in P [PDF]
We present a new distributed model of probabilistically checkable proofs (PCP). A satisfying assignment $x \in \{0,1\}^n$ to a CNF formula $\varphi$ is shared between two parties, where Alice knows $x_1, \dots, x_{n/2}$, Bob knows $x_{n/2+1},\dots,x_n$ ...
Abboud, Amir +2 more
DiBELLA: Distributed long read to long read alignment [PDF]
We present a parallel algorithm and scalable implementation for genome analysis, specifically the problem of finding overlaps and alignments for data from "third generation" long read sequencers [29]. While long sequences of DNA offer enormous advantages
Buluç, A +4 more
SLIM: Scalable Linkage of Mobility Data [PDF]
We present a scalable solution to link entities across mobility datasets using their spatio-temporal information. This is a fundamental problem in many applications such as linking user identities for security, understanding privacy limitations of ...
Atluri Gowtham +7 more
Handling Massive N-Gram Datasets Efficiently [PDF]
This paper deals with the two fundamental problems concerning the handling of large n-gram language models: indexing, that is compressing the n-gram strings and associated satellite data without compromising their retrieval speed; and estimation, that is
Pibiri, Giulio Ermanno +1 more
Scalable and Sustainable Deep Learning via Randomized Hashing
Current deep learning architectures are growing larger in order to learn from complex datasets. These architectures require giant matrix multiplication operations to train millions of parameters.
Chen Wenlin +8 more
Weighted ancestors in suffix trees [PDF]
The classical, ubiquitous, predecessor problem is to construct a data structure for a set of integers that supports fast predecessor queries. Its generalization to weighted trees, a.k.a.
D.E. Willard +6 more
Improved Approximate String Matching and Regular Expression Matching on Ziv-Lempel Compressed Texts
We study the approximate string matching and regular expression matching problem for the case when the text to be searched is compressed with the Ziv-Lempel adaptive dictionary compression schemes.
A. Amir +15 more
Optimal Las Vegas Locality Sensitive Data Structures
We show that approximate similarity (near neighbour) search can be solved in high dimensions with performance matching state of the art (data independent) Locality Sensitive Hashing, but with a guarantee of no false negatives. Specifically, we give two
Ahle, Thomas Dybdahl

