Efficient Metric Indexing for Similarity Search and Similarity Joins
Spatial queries including similarity search and similarity joins are useful in many areas, such as multimedia retrieval, data integration, and so on. However, they are not supported well by commercial DBMSs. This may be due to the complex data types involved and the needs for flexible similarity criteria seen in real applications.
Lu Chen +2 more
exaly +4 more sources
With the increasing ability of current applications to produce and consume more complex data, such as images and geographic information, the similarity join has attracted considerable attention. However, this operator does not consider the relationship among the elements in the answer, generating results with many pairs similar among themselves, which ...
Lúcio F. D. Santos +4 more
openaire +2 more sources
Similarity Joins of Sparse Features
Identifying all pairs of records from two datasets whose similarity exceeds a given threshold is crucial for data cleaning and clustering. Our work on similarity-joins is motivated by detecting fraud and abuse. We focus on similarity-joins of sparse features, where records represent sparse sets, multisets, or vectors.
Ahmed Metwally 0001, Michael Shum
openaire +2 more sources
Similarity joins for uncertain strings
A string similarity join finds all similar string pairs between two input string collections. It is an essential operation in many applications, such as data integration and cleaning, and has been extensively studied for deterministic strings. Increasingly, many applications have to deal with imprecise strings or strings with fuzzy information in them.
Manish Patil, Rahul Shah 0001
openaire +2 more sources
The similarity join database operator
Similarity joins have been studied as key operations in multiple application domains, e.g., record linkage, data cleaning, multimedia and video applications, and phenomena detection on sensor networks. Multiple similarity join algorithms and implementation techniques have been proposed.
Silva, Yasin +2 more
openaire +3 more sources
Some of the following articles may not be open access.
GPU Acceleration of Set Similarity Joins
Lecture Notes in Computer Science, 2015
We propose a scheme of efficient set similarity joins on Graphics Processing Units (GPUs). Due to the rapid growth and diversification of data, there is an increasing demand for fast execution of set similarity joins in applications that vary from data integration to plagiarism detection.
Yusuke Kozawa +2 more
exaly +2 more sources
Parallel trajectory similarity joins in spatial networks [PDF]
The matching of similar pairs of objects, called similarity join, is fundamental functionality in data management. We consider two cases of trajectory similarity joins (TS-Joins), including a threshold-based join (Tb-TS-Join) and a top-k TS-Join (k-TS ...
Shuo Shang, Lisi Chen, Zhewei Wei
exaly +3 more sources
Similarity Joins on Item Set Collections Using Zero-Suppressed Binary Decision Diagrams [PDF]
Similarity joins between two collections of item sets have recently been investigated and have attracted significant attention, especially for linguistic applications such as those involving spelling error corrections and data cleaning. In this paper, we
Oyama Satoshi
exaly +2 more sources
2008 IEEE 24th International Conference on Data Engineering, 2008
Similarity joins have attracted significant interest, with applications in geographical information systems, astronomy, marketing analyses, and anomaly detection. However, all the past algorithms, although highly fine-tuned, suffer an output explosion if the query range is even moderately large relative to the local data density.
Brent Bryan +2 more
openaire +1 more source
Streaming Set Similarity Joins
2021
We consider the problem of efficiently answering set similarity joins over streams. This problem is challenging both in terms of CPU cost, because similarity matching is computationally much more expensive than equality comparisons, and memory requirements, due to the unbounded nature of streams.
Lucas Pacífico +1 more
openaire +1 more source

