Efficient and Scalable Graph Similarity Joins in MapReduce [PDF]
With the emergence of massive graph-modeled data, it is of great importance to investigate graph similarity joins because of their wide range of applications, including data cleaning and near-duplicate detection.
Yifan Chen +4 more
doaj +2 more sources
Handling data-skewness in character based string similarity join using Hadoop [PDF]
The scalability of similarity joins is threatened by data skewness, an unexpected characteristic of the data. This is a pervasive problem in scientific data.
Kanak Meena +3 more
doaj +1 more source
Similarity Join and Similarity Self-Join Size Estimation in a Streaming Environment [PDF]
We study the problem of similarity self-join and similarity join size estimation in a streaming setting where the goal is to estimate, in one scan of the input and with sublinear space in the input size, the number of record pairs that have a similarity within a given threshold.
Davood Rafiei, Fan Deng 0004
openaire +2 more sources
Preference-driven similarity join [PDF]
Similarity join, which can find similar objects (e.g., products, names, addresses) across different sources, is powerful in dealing with variety in big data, especially web data. Threshold-driven similarity join, which has been extensively studied in the past, assumes that a user is able to specify a similarity threshold, and then focuses on how to ...
Chuancong Gao +4 more
openaire +2 more sources
Streaming similarity self-join [PDF]
We introduce and study the problem of computing the similarity self-join in a streaming context (SSSJ), where the input is an unbounded stream of items arriving continuously. The goal is to find all pairs of items in the stream whose similarity is greater than a given threshold.
Gionis Aristides +1 more
openaire +4 more sources
Dynamic Enumeration of Similarity Joins
This paper considers enumerating answers to similarity-join queries under dynamic updates: Given two sets of $n$ points $A,B$ in $\mathbb{R}^d$, a metric $\phi(\cdot)$, and a distance threshold $r > 0$, report all pairs of points $(a, b) \in A \times B$ with $\phi(a,b) \le r$.
Agarwal, Pankaj K. +3 more
openaire +4 more sources
Spatio-textual similarity joins [PDF]
Given a collection of objects that carry both spatial and textual information, a spatio-textual similarity join retrieves the pairs of objects that are spatially close and textually similar. As an example, consider a social network with spatially and textually tagged persons (i.e., their locations and profiles).
Panagiotis Bouros +2 more
openaire +2 more sources
Laser Metal Deposition and Wire Arc Additive Manufacturing of Materials: An Overview [PDF]
Additive manufacturing (AM) is a process that joins similar or dissimilar materials into application-oriented objects in a wide range of sizes and shapes.
R. Rumman +3 more
doaj +1 more source
A similarity network for web services operations substitution
Studying the similarity between Web service operations is key to solving many problems, especially those related to substitution: for example, when a call fails or an operation malfunctions, a list of similar operations is returned to the customer. He chooses
Rekkal Sara +2 more
doaj +1 more source
A game theoretic approach to balance privacy risks and familial benefits
As recreational genomics continues to grow in its popularity, many people are afforded the opportunity to share their genomes in exchange for various services, including third-party interpretation (TPI) tools, to understand their predisposition to health
Jia Guo +7 more
doaj +1 more source

