Streaming Similarity Self-Join [PDF]
We introduce and study the problem of computing the similarity self-join in a streaming context (SSSJ), where the input is an unbounded stream of items arriving continuously.
Gionis, Aristides +1 more
core +5 more sources
Scalable and Robust Set Similarity Join [PDF]
Set similarity join is a fundamental and well-studied database operator. It is usually studied in the exact setting where the goal is to compute all pairs of sets that exceed a given similarity threshold (measured e.g. as Jaccard similarity).
Christiani, Tobias Lybecker +2 more
core +5 more sources
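As an illustration of the exact setting this abstract describes, the following is a minimal brute-force sketch of a set similarity self-join under Jaccard similarity. It is not the method of the paper; the function names `jaccard` and `naive_set_similarity_self_join` are hypothetical, and treating "exceed" as "greater than or equal to" is an assumption.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def naive_set_similarity_self_join(sets, threshold):
    """Return index pairs (i, j) with i < j whose Jaccard similarity is >= threshold.

    Hypothetical helper for illustration: compares every pair, so it takes
    quadratic time in the number of sets.
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(sets), 2)
        if jaccard(a, b) >= threshold
    ]


if __name__ == "__main__":
    example = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
    print(naive_set_similarity_self_join(example, 0.5))
```

The quadratic pairwise comparison is the baseline that scalable set similarity join algorithms aim to avoid.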
I/O-Efficient Similarity Join [PDF]
We present an I/O-efficient algorithm for computing similarity joins based on locality-sensitive hashing (LSH). In contrast to the filtering methods commonly suggested, our method has provable sub-quadratic dependency on the data size.
Pagh, Rasmus +3 more
core +10 more sources
Adaptive MapReduce Similarity Joins [PDF]
Similarity joins are a fundamental database operation. Given data sets S and R, the goal of a similarity join is to find all points x in S and y in R with distance at most r.
McCauley, Samuel; Silvestri, Francesco
core +6 more sources
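The definition in this abstract (find all x in S and y in R with distance at most r) can be sketched as a brute-force nested loop. This is only a baseline for illustration, not the adaptive MapReduce algorithm; the function name `naive_similarity_join` is hypothetical, and Euclidean distance is an assumption, since the snippet does not name a metric.

```python
import math


def naive_similarity_join(S, R, r):
    """Return all pairs (x, y), x in S and y in R, with Euclidean distance <= r.

    Hypothetical helper for illustration: checks all |S| * |R| pairs.
    Points are equal-length tuples of numbers.
    """
    return [(x, y) for x in S for y in R if math.dist(x, y) <= r]


if __name__ == "__main__":
    S = [(0, 0), (5, 5)]
    R = [(0, 1), (5, 7), (9, 9)]
    print(naive_similarity_join(S, R, 2))
```

Every pair is examined here regardless of the output size, which is the cost that distributed and index-based approaches try to reduce.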
Preference-driven similarity join [PDF]
Similarity join, which can find similar objects (e.g., products, names, addresses) across different sources, is powerful in dealing with variety in big data, especially web data. Threshold-driven similarity join, which has been extensively studied in the past, assumes that a user is able to specify a similarity threshold, and then focuses on how to ...
Chuancong Gao +4 more
openaire +2 more sources
Dynamic Enumeration of Similarity Joins
This paper considers enumerating answers to similarity-join queries under dynamic updates: Given two sets of $n$ points $A,B$ in $\mathbb{R}^d$, a metric $ϕ(\cdot)$, and a distance threshold $r > 0$, report all pairs of points $(a, b) \in A \times B$ with $ϕ(a,b) \le r$.
Agarwal, Pankaj K. +3 more
openaire +4 more sources
Spatio-textual similarity joins [PDF]
Given a collection of objects that carry both spatial and textual information, a spatio-textual similarity join retrieves the pairs of objects that are spatially close and textually similar. As an example, consider a social network with spatially and textually tagged persons (i.e., their locations and profiles).
Panagiotis Bouros +2 more
openaire +2 more sources
Projection Based Large Scale High-Dimensional Data Similarity Join Using MapReduce Framework
Similarity join has been widely used in many data analysis and data mining applications. We mainly focus on the scalability and performance problems of similarity join queries on massive high-dimensional data sets.
Youzhong Ma +3 more
doaj +1 more source
On link-based similarity join [PDF]
Graphs can be found in applications like social networks, bibliographic networks, and biological databases. Understanding the relationship, or links, among graph nodes enables applications such as link prediction, recommendation, and spam detection. In this paper, we propose link-based similarity join
Liwen Sun +4 more
openaire +1 more source
Traditionally, citizen science has centred on giving lay people opportunities to learn about science by participating in it. Lately, psychological citizen science projects have increasingly aimed to attract participants by providing an opportunity for ...
Anna Rudnicka, Sandy Gould, Anna Cox
doaj +1 more source

