Some of the following articles may not be open access.
Transforming Pairwise Duplicates to Entity Clusters for High-quality Duplicate Detection
Journal of Data and Information Quality, 2019
Duplicate detection algorithms produce clusters of database records, each cluster representing a single real-world entity. As most of these algorithms use pairwise comparisons, the resulting (transitive) clusters can be inconsistent: Not all records within a cluster are sufficiently similar to be classified as duplicate.
Draisbach, Uwe +2 more
openaire +2 more sources
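The inconsistency described in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: it only builds clusters as the transitive closure of pairwise matches (using a union-find structure) and then lists the pairs inside a cluster that were never classified as duplicates. The function names, record identifiers, and sample pairs are hypothetical.

```python
from itertools import combinations


def transitive_clusters(record_ids, duplicate_pairs):
    """Group records via the transitive closure of pairwise duplicate decisions."""
    parent = {r: r for r in record_ids}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in duplicate_pairs:
        parent[find(a)] = find(b)

    clusters = {}
    for r in record_ids:
        clusters.setdefault(find(r), []).append(r)
    return sorted(sorted(c) for c in clusters.values())


def missing_pairs(cluster, duplicate_pairs):
    """Return pairs inside a cluster that were never classified as duplicates."""
    matched = {frozenset(p) for p in duplicate_pairs}
    return [
        (a, b)
        for a, b in combinations(sorted(cluster), 2)
        if frozenset((a, b)) not in matched
    ]


# Hypothetical input: r1~r2 and r2~r3 were matched, but r1~r3 was not.
records = ["r1", "r2", "r3", "r4"]
pairs = [("r1", "r2"), ("r2", "r3")]

clusters = transitive_clusters(records, pairs)
inconsistent = missing_pairs(clusters[0], pairs)
```

Here r1 and r3 end up in the same cluster only through r2, so the cluster is inconsistent in the sense the abstract describes.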
The Detection Algorithms for Similar Duplicate Data
2019 6th International Conference on Systems and Informatics (ICSAI), 2019
This paper studied and analyzed three algorithms that can be used to detect similar duplicate data records. Two of them, the basic sorted-neighborhood method (SNM) and the priority queue algorithm, are commonly used duplicate detection algorithms, and both are based on the sort-and-merge idea. The third one is Density-Based Spatial Clustering of
Jin-Yu Song, Quan Yu, Ruo-yu Bao
openaire +1 more source
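As a rough illustration of the sort-and-merge idea behind the basic sorted-neighborhood method, the sketch below sorts records by a key and compares each record only with its neighbors inside a sliding window. It is not the paper's implementation: the window size, the similarity threshold, the use of `difflib.SequenceMatcher` as the similarity measure, and the sample names are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def sorted_neighborhood(records, key, window=3, threshold=0.8):
    """Return candidate duplicate pairs found inside a sliding window.

    records   -- list of strings (assumed record representation)
    key       -- function producing the sorting key for a record
    window    -- number of consecutive records compared with each other
    threshold -- minimum similarity ratio to report a pair (assumed value)
    """
    ordered = sorted(records, key=key)
    matches = []
    for i, rec in enumerate(ordered):
        # Compare only with the next (window - 1) records in sorted order.
        for other in ordered[i + 1 : i + window]:
            if SequenceMatcher(None, rec, other).ratio() >= threshold:
                matches.append((rec, other))
    return matches


# Hypothetical sample data.
names = ["john smith", "jon smith", "mary jones", "john smyth", "alice brown"]
candidate_pairs = sorted_neighborhood(names, key=lambda s: s)
```

The window keeps the number of comparisons roughly linear in the number of records, at the cost of missing duplicates whose sorting keys land far apart.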
Detecting Duplicate Pull-requests in GitHub
Proceedings of the 9th Asia-Pacific Symposium on Internetware, 2017
The widespread use of pull-requests boosts the development and evolution of many open source software projects. However, due to the parallel and uncoordinated nature of the development process in GitHub, duplicate pull-requests may be submitted by different contributors to solve the same problem.
Zhixing Li +4 more
openaire +1 more source
Duplicates Detection, Counting, and Removal
1992
The need to detect and eliminate duplicate elements arises in many applications such as the processing of relational database operations, comparison of complex objects, transitive closure, and protocol verification. Given a multiset, the process of detecting duplicates, eliminating them, sorting the remaining distinct elements, and counting the number ...
openaire +1 more source
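The operations named in the abstract above (detecting duplicates in a multiset, eliminating them, sorting the distinct elements, and counting) can be sketched in a few lines. This is only a plain illustration with `collections.Counter`, not the technique from the cited work; the function name and sample data are hypothetical.

```python
from collections import Counter


def dedupe_sort_count(items):
    """Return the sorted distinct elements and the counts of duplicated ones."""
    counts = Counter(items)          # occurrences of each element
    distinct = sorted(counts)        # duplicates eliminated, remainder sorted
    duplicates = {x: n for x, n in counts.items() if n > 1}
    return distinct, duplicates


# Hypothetical multiset.
distinct, duplicates = dedupe_sort_count([3, 1, 3, 2, 1, 3])
```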
Duplicate detection algorithms of bibliographic descriptions
Library Hi Tech, 2008
Purpose – The purpose of this paper is to focus on duplicate record detection algorithms used for detection in bibliographic databases. Design/methodology/approach – Individual algorithms, their application process for duplicate detection and their results are described based on available literature (published articles), information found at various ...
Anestis Sitas, Sarantos Kapidakis
openaire +1 more source
Evaluating Indeterministic Duplicate Detection Results
2012
Duplicate detection is an important process for cleaning or integrating data. Since real-life data is often polluted, detecting duplicates usually comes along with uncertainty. To handle duplicate uncertainty in an appropriate way, indeterministic duplicate detection approaches, i.e.
Panse, Fabian, Ritter, Norbert
openaire +2 more sources
Unsupervised Duplicate Detection Using Sample Non-Duplicates
2008
The problem of identifying objects in databases that refer to the same real-world entity is known, among others, as duplicate detection or record linkage. Objects may be duplicates even though they are not identical, due to errors and missing data. Traditional scenarios for duplicate detection are data warehouses, which are populated from several data
openaire +2 more sources
Duplicate document detection by template matching
Image and Vision Computing, 2000
We discuss some operational issues pertaining to the detection of duplicates in the databases of bitmapped binary document images, and reason that efficient and effective duplicate document detection probably needs a combination of an efficient primary detector and an effective subordinate detector to be achieved.
openaire +1 more source
Duplicate Journal Title Detection in References
2009
Our research efforts are oriented towards applying text mining techniques in order to help librarians make more informative decisions when selecting learning resources to be included in the library’s offer. The proper selection of learning resources to be included in the library’s offer is one of the key factors determining the overall usefulness of ...
Kovačević, A., Devedžić, Vladan
openaire +2 more sources
Does Deep Learning improve the performance of duplicate bug report detection? An empirical study
Journal of Systems and Software, 2023
Yuan Jiang +2 more
exaly

