Aggregate queries on probabilistic record linkages [PDF]
Record linkage analysis, which matches records referring to the same real world entities from different data sets, is an important task in data integration. Uncertainty often exists in record linkages due to incompleteness or ambiguity in data.
Ming Hua, Jian Pei
openaire +1 more source
Duplicate Detection in Probabilistic Data [PDF]
Collected data often contains uncertainties. Probabilistic databases have been proposed to manage uncertain data. To combine data from multiple autonomous probabilistic databases, an integration of probabilistic data has to be performed.
Keijzer, Ander de +3 more
core +10 more sources
An Introduction to Probabilistic Record Linkage [PDF]
Since its post-World War II inception, the science of record linkage has grown exponentially and is used across industrial, governmental, and academic agencies. The academic fields that rely on record linkage are diverse, ranging from history to public health to demography.
Jana Asher +4 more
openaire +1 more source
Probabilistic linkage to enhance deterministic algorithms and reduce data linkage errors in hospital administrative data. [PDF]
BACKGROUND: The pseudonymisation algorithm used to link together episodes of care belonging to the same patients in England (HESID) has never undergone a formal evaluation to determine the extent of data linkage error.
Aldridge, Robert +4 more
core +3 more sources
A Probabilistic Record Linkage Model for Survival Data [PDF]
In the absence of a unique identifier, combining information from multiple files relies on partially identifying variables (e.g. gender, initials). With a record linkage procedure, these variables are used to distinguish record pairs that belong together (matches) from record pairs that do not belong together (non-matches). Generally, the combined
Hof, Michel H. +2 more
openaire +2 more sources
HIV and cancer registry linkage identifies a substantial burden of cancers in persons with HIV in India. [PDF]
We utilized computerized record-linkage methods to link HIV and cancer databases with limited unique identifiers in Pune, India, to determine feasibility of linkage and obtain preliminary estimates of cancer risk in persons living with HIV (PLHIV) as ...
Bhatia, Kishor +13 more
core +1 more source
Revisiting the probabilistic method of record linkage
In theory, the probabilistic linkage method provides two distinct advantages over non-probabilistic methods: minimal rates of linkage error and accurate measures of these rates for data users. However, implementations can fall short of these expectations either because the conditional independence assumption is made, or because a model with ...
Abel Dasylva +3 more
openaire +2 more sources
Validating Distance-Based Record Linkage with Probabilistic Record Linkage [PDF]
This work compares two alternative methods for record linkage: distance based and probabilistic record linkage. It compares the performance of both approaches when data is categorical. To that end, a distance over ordinal and nominal scales is defined.
Josep Domingo-Ferrer, Vicenç Torra
openaire +1 more source
Probabilistic Clustering of Time-Evolving Distance Data [PDF]
We present a novel probabilistic clustering model for objects that are represented via pairwise distances and observed at different time points. The proposed method utilizes the information given by adjacent time points to find the underlying cluster ...
AK Jain +27 more
core +1 more source
Sociodemographic and Health Characteristics, Rather Than Primary Care Supply, are Major Drivers of Geographic Variation in Preventable Hospitalizations in Australia [PDF]
ACKNOWLEDGMENTS: The authors thank the many thousands of people participating in the 45 and Up Study. The authors also thank the Sax Institute, the NSW Ministry of Health, and the NSW Register of Births, Deaths, and Marriages for allowing access to the ...
Blyth, Fiona M +5 more
core +1 more source

