Similarity join - Open Access .click

Proceedings of the VLDB Endowment, 2016
We introduce and study the problem of computing the similarity self-join in a streaming context (SSSJ), where the input is an unbounded stream of items arriving continuously.
Gionis, Aristides, De Francisci Morales, Gianmarco +1 more
core +5 more sources

Scalable and Robust Set Similarity Join [PDF]

2018 IEEE 34th International Conference on Data Engineering (ICDE), 2018
Set similarity join is a fundamental and well-studied database operator. It is usually studied in the exact setting where the goal is to compute all pairs of sets that exceed a given similarity threshold (measured e.g. as Jaccard similarity).
Christiani, Tobias Lybecker +2 more
core +5 more sources
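
To make the exact setting described in this abstract concrete, here is a minimal brute-force sketch of a set similarity self-join under Jaccard similarity. It is not the method of the listed paper; the function names, the `>=` comparison, and the treatment of two empty sets are assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets.

    Assumption: two empty sets are treated as identical (similarity 1.0).
    """
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def set_similarity_self_join(sets, threshold):
    """Return index pairs (i, j), i < j, with Jaccard similarity >= threshold.

    Brute force: compares every pair, so the cost is quadratic in len(sets).
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(sets), 2)
        if jaccard(a, b) >= threshold
    ]


if __name__ == "__main__":
    data = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
    print(set_similarity_self_join(data, 0.5))
```

The quadratic pairwise comparison is the baseline that filtering and hashing approaches try to avoid on large inputs.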

I/O-Efficient Similarity Join [PDF]

Algorithmica, 2015
We present an I/O-efficient algorithm for computing similarity joins based on locality-sensitive hashing (LSH). In contrast to the filtering methods commonly suggested, our method has provable sub-quadratic dependency on the data size.
Pagh, Rasmus +3 more
core +10 more sources

Adaptive MapReduce Similarity Joins [PDF]

Proceedings of the 5th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond, 2018
Similarity joins are a fundamental database operation. Given data sets S and R, the goal of a similarity join is to find all points x in S and y in R with distance at most r.
McCauley, Samuel, Silvestri, Francesco
core +6 more sources
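
The join definition in this abstract (all x in S and y in R with distance at most r) can be illustrated with a naive nested-loop sketch. This is only a reference formulation, not the adaptive MapReduce algorithm of the listed paper; Euclidean distance and the function name `similarity_join` are assumptions, since the abstract does not fix a metric.

```python
import math


def similarity_join(S, R, r):
    """Return all pairs (x, y) with x in S, y in R and distance(x, y) <= r.

    Assumption: points are equal-length coordinate tuples and the distance
    is Euclidean. The nested loop costs O(|S| * |R|) distance computations.
    """
    return [(x, y) for x in S for y in R if math.dist(x, y) <= r]


if __name__ == "__main__":
    S = [(0, 0), (5, 5)]
    R = [(1, 0), (5, 6), (10, 10)]
    print(similarity_join(S, R, 1))
```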

Preference-driven similarity join [PDF]

Proceedings of the International Conference on Web Intelligence, 2017
Similarity join, which can find similar objects (e.g., products, names, addresses) across different sources, is powerful in dealing with variety in big data, especially web data. Threshold-driven similarity join, which has been extensively studied in the past, assumes that a user is able to specify a similarity threshold, and then focuses on how to ...
Chuancong Gao +4 more
openaire +2 more sources

Dynamic Enumeration of Similarity Joins

CoRR, 2021
This paper considers enumerating answers to similarity-join queries under dynamic updates: Given two sets of $n$ points $A,B$ in $\mathbb{R}^d$, a metric $ϕ(\cdot)$, and a distance threshold $r > 0$, report all pairs of points $(a, b) \in A \times B$ with $ϕ(a,b) \le r$.
Agarwal, Pankaj K., Hu, Xiao, Sintos, Stavros, Yang, Jun +3 more
openaire +4 more sources

Spatio-textual similarity joins [PDF]

Proceedings of the VLDB Endowment, 2012
Given a collection of objects that carry both spatial and textual information, a spatio-textual similarity join retrieves the pairs of objects that are spatially close and textually similar. As an example, consider a social network with spatially and textually tagged persons (i.e., their locations and profiles).
Panagiotis Bouros, Shen Ge, Nikos Mamoulis +2 more
openaire +2 more sources

Projection Based Large Scale High-Dimensional Data Similarity Join Using MapReduce Framework

IEEE Access, 2020
Similarity join has been widely used in many data analysis and data mining applications. We mainly focus on the scalability and performance problems of similarity join queries on massive high-dimensional data sets.
Youzhong Ma +3 more
doaj +1 more source

On link-based similarity join [PDF]

Proceedings of the VLDB Endowment, 2011
Graphs can be found in applications like social networks, bibliographic networks, and biological databases. Understanding the relationship, or links, among graph nodes enables applications such as link prediction, recommendation, and spam detection. In this paper, we propose link-based similarity join
Liwen Sun +4 more
openaire +1 more source

Citizen Scientists Are Not Just Quiz Takers: Information about Project Type Influences Data Disclosure in Online Psychological Surveys

Citizen Science: Theory and Practice, 2022
Traditionally, citizen science has centred on giving lay people opportunities to learn about science by participating in it. Lately, psychological citizen science projects have increasingly aimed to attract participants by providing an opportunity for ...
Anna Rudnicka, Sandy Gould, Anna Cox
doaj +1 more source
