Is Data Cleaning and the Testing of Assumptions still relevant in the 21st Century? [PDF]
Jason W Osborne
doaj +2 more sources
Cleaning Data With Selection Rules
In this paper, we propose and study a type of tuple-level constraint that arises from the selection operator $\sigma$ of relational algebra and that closely resembles the concepts of tuple-level denial constraints.
Toon Boeckling +2 more
doaj +1 more source
LR-BCA: Label Ranking for Bridge Condition Assessment
Bridge condition assessment (BCA) plays an important role in modern bridge management. Existing assessment methods are time-consuming, labor-intensive and error-prone. The use of machine learning for BCA can effectively solve the above problems. However,
Kai Wang, Tong Ruan, Faxiang Xie
doaj +1 more source
Establishing a sorting protocol for healthcare databases
Background: Health information records in many countries, especially developing countries, are still paper based. Compared to electronic systems, paper-based systems are disadvantageous in terms of data storage and data extraction.
Elie Ghabi +7 more
doaj +1 more source
It is recognized that the performance of any prediction model is a function of several factors. One of the most significant factors is the adopted preprocessing techniques.
Esra’a Alshdaifat +4 more
doaj +1 more source
Exploratory Data Mining and Data Cleaning
s not available for ...
Nicholas Cox
doaj +1 more source
A revival of integrity constraints for data cleaning [PDF]
Integrity constraints, a.k.a. data dependencies, are being widely used for improving the quality of schema.
Wenfei Fan, Floris Geerts, Xibei Jia
core +1 more source
A Protocol for Collecting Burned Area Time Series Cross-Check Data
Data on wildfire growth are useful for multiple research purposes but are frequently unavailable and often have data quality problems. For these reasons, we developed a protocol for collecting daily burned area time series from the InciWeb website ...
Harry R. Podschwit +2 more
doaj +1 more source
Data Validation Infrastructure for R
Checking data quality against domain knowledge is a common activity that pervades statistical analysis from raw data to output. The R package validate facilitates this task by capturing and applying expert knowledge in the form of validation rules ...
Mark P. J. van der Loo, Edwin de Jonge
doaj +1 more source
Interval model of a wind turbine power curve
The wind turbine power curve model is critical to a wind turbine’s power prediction and performance analysis. However, abnormal data in the training set decrease the prediction accuracy of trained models.
Kai Zhou +9 more
doaj +1 more source

