
Ontology- and LLM-based data harmonization for federated learning in healthcare.

open access: yes
Front Digit Health
Kokash N   +7 more
europepmc   +1 more source

Kandinsky: enabling neighbourhood analysis of spatial omics data for functional insights on cell ecosystems

open access: yes
Andrei P   +7 more
europepmc   +1 more source

The similarity join database operator

open access: yes
2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), 2010
Similarity joins have been studied as key operations in multiple application domains, e.g., record linkage, data cleaning, multimedia and video applications, and phenomena detection on sensor networks. Multiple similarity join algorithms and implementation techniques have been proposed.
Silva, Yasin   +2 more
openaire   +3 more sources

Document Similarity Self-Join with MapReduce

open access: yes
2010 IEEE International Conference on Data Mining, 2010
Given a collection of objects, the Similarity Self-Join problem requires discovering all pairs of objects whose similarity is above a user-defined threshold. In this paper we focus on document collections, which are characterized by a sparseness that allows effective pruning strategies.
Ranieri Baraglia   +2 more
openaire   +2 more sources

String similarity search and join: a survey

Frontiers of Computer Science, 2015
String similarity search and join are two important operations in data cleaning and integration, which extend traditional exact search and exact join operations in databases by tolerating the errors and inconsistencies in the data. They have many real-world applications, such as spell checking, duplicate detection, entity resolution, and webpage ...
Guoliang Li, Dong Deng, Feng Jianhua
exaly   +2 more sources

Compact Similarity Joins

2008 IEEE 24th International Conference on Data Engineering, 2008
Similarity joins have attracted significant interest, with applications in geographical information systems, astronomy, marketing analyses, and anomaly detection. However, all the past algorithms, although highly fine-tuned, suffer an output explosion if the query range is even moderately large relative to the local data density.
Brent Bryan   +2 more
openaire   +1 more source

Similarity joins for uncertain strings

Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, 2014
A string similarity join finds all similar string pairs between two input string collections. It is an essential operation in many applications, such as data integration and cleaning, and has been extensively studied for deterministic strings. Increasingly, many applications have to deal with imprecise strings or strings with fuzzy information in them.
Manish Patil, Rahul Shah
openaire   +1 more source

Streaming Set Similarity Joins

2021
We consider the problem of efficiently answering set similarity joins over streams. This problem is challenging both in terms of CPU cost, because similarity matching is computationally much more expensive than equality comparisons, and memory requirements, due to the unbounded nature of streams.
Lucas Pacífico   +1 more
openaire   +1 more source

High-dimensional similarity joins

IEEE Transactions on Knowledge and Data Engineering, 2002
Many emerging data mining applications require a similarity join between points in a high-dimensional domain. We present a new algorithm that utilizes a new index structure, called the ε tree, for fast spatial similarity joins on high-dimensional points.
Kyuseok Shim   +2 more
openaire   +1 more source
