
String similarity search and join: a survey

Frontiers of Computer Science, 2015
String similarity search and join are two important operations in data cleaning and integration, which extend traditional exact search and exact join operations in databases by tolerating the errors and inconsistencies in the data. They have many real-world applications, such as spell checking, duplicate detection, entity resolution, and webpage ...
Minghe Yu 0001   +3 more

An Efficient Similarity Join Algorithm with Cosine Similarity Predicate

2010
Given a large collection of objects, finding all pairs of similar objects, namely similarity join, is widely used to solve various problems in many application domains. The computation time of similarity join is a critical issue, since similarity join requires computing similarity values for all possible pairs of objects.
Dongjoo Lee   +3 more

Parallelizing String Similarity Join Algorithms

2018
A key operation in data cleaning and integration is the use of string similarity join (SSJ) algorithms to identify and remove duplicates or similar records within data sets. With the advent of big data, a natural question is how to parallelize SSJ algorithms.
Ling-Chih Yao, Lipyeow Lim

String Similarity Join with Different Thresholds

2015
String similarity join is an essential operation in many applications that need to find all similar string pairs from two given collections. Existing approaches use uniform, predefined similarity thresholds. In real applications, however, longer string pairs typically tolerate many more typos, so it is necessary to apply ...
Chuitian Rong, Xiangling Zhang

An efficient algorithm for approximated self-similarity joins in metric spaces

Information Systems, 2020
Sebastian Ferrada   +2 more

Privacy preserving similarity joins using MapReduce

Information Sciences, 2019
Xiaofeng Ding   +2 more

VChunkJoin: An Efficient Algorithm for Edit Similarity Joins

IEEE Transactions on Knowledge and Data Engineering, 2013
Jianbin Qin, Chuan Xiao, Xuemin Lin

Output-Optimal Massively Parallel Algorithms for Similarity Joins

ACM Transactions on Database Systems, 2019
Ke Yi, Yufei Tao

An Experimental Survey of MapReduce-Based Similarity Joins

Lecture Notes in Computer Science, 2016
Yasin N Silva, Chuitian Rong
