Cleaning Data With Selection Rules

open access: yes, IEEE Access, 2022
In this paper, we propose and study a type of tuple-level constraint that arises from the selection operator $\sigma$ of relational algebra and that closely resembles the concepts of tuple-level denial constraints.
Toon Boeckling   +2 more
doaj   +1 more source

LR-BCA: Label Ranking for Bridge Condition Assessment

open access: yes, IEEE Access, 2021
Bridge condition assessment (BCA) plays an important role in modern bridge management. Existing assessment methods are time-consuming, labor-intensive and error-prone. The use of machine learning for BCA can effectively solve the above problems. However,
Kai Wang, Tong Ruan, Faxiang Xie
doaj   +1 more source

Establishing a sorting protocol for healthcare databases

open access: yes, Journal of Public Health Research, 2021
Background: Health information records in many countries, especially developing countries, are still paper based. Compared to electronic systems, paper-based systems are disadvantageous in terms of data storage and data extraction.
Elie Ghabi   +7 more
doaj   +1 more source

The Effect of Preprocessing Techniques, Applied to Numeric Features, on Classification Algorithms’ Performance

open access: yes, Data, 2021
It is recognized that the performance of any prediction model is a function of several factors. One of the most significant factors is the adopted preprocessing techniques.
Esra’a Alshdaifat   +4 more
doaj   +1 more source

Exploratory Data Mining and Data Cleaning

open access: yes, Journal of Statistical Software, 2004
s not available for ...
Nicholas Cox
doaj   +1 more source

A revival of integrity constraints for data cleaning [PDF]

open access: yes, 2008
Integrity constraints, a.k.a. data dependencies, are being widely used for improving the quality of schema.
Fan, Wenfei, Geerts, Floris, Jia, Xibei
core   +1 more source

A Protocol for Collecting Burned Area Time Series Cross-Check Data

open access: yes, Fire, 2022
Data on wildfire growth are useful for multiple research purposes but are frequently unavailable and often have data quality problems. For these reasons, we developed a protocol for collecting daily burned area time series from the InciWeb website ...
Harry R. Podschwit   +2 more
doaj   +1 more source

Data Validation Infrastructure for R

open access: yes, Journal of Statistical Software, 2021
Checking data quality against domain knowledge is a common activity that pervades statistical analysis from raw data to output. The R package validate facilitates this task by capturing and applying expert knowledge in the form of validation rules ...
Mark P. J. van der Loo, Edwin de Jonge
doaj   +1 more source

Interval model of a wind turbine power curve

open access: yes, Frontiers in Energy Research, 2023
The wind turbine power curve model is critical to a wind turbine’s power prediction and performance analysis. However, abnormal data in the training set decrease the prediction accuracy of trained models.
Kai Zhou   +9 more
doaj   +1 more source
