Big data cleaning modeling of operation status of coal mine fully-mechanized coal mining equipment
To address the large volume of data, the noise, and the missing values in operation-status data from coal mine fully-mechanized coal mining equipment, a big data cleaning model of operation status of coal mine fully-mechanized coal mining ...
MA Hongwei +4 more
doaj +1 more source
Parallel Computation of Rough Set Approximations in Information Systems with Missing Decision Data
The paper discusses the use of parallel computation to obtain rough set approximations from large-scale information systems where missing data exist in both condition and decision attributes.
Thinh Cao +4 more
doaj +1 more source
Computing marginals using MapReduce
We consider the problem of computing the data-cube marginals of a fixed order $k$ (i.e., all marginals that aggregate over $k$ dimensions), using a single round of MapReduce. The focus is on the relationship between the reducer size (number of inputs allowed at a single reducer) and the replication rate (number of reducers to which an input is sent ...
Afrati, Foto N. +3 more
openaire +3 more sources
Spatial hotspot detection using polygon propagation
Spatial scan statistics is one of the most important models for detecting high-activity regions, or hotspots, in real-world applications on geographic data, such as epidemiology, public health, astronomy, and criminology. Traditional scan statistic
Satya Katragadda +2 more
doaj +1 more source
An alternative C++-based HPC system for Hadoop MapReduce
MapReduce (MR) is a technique that vastly improves distributed data processing and can massively speed up computation. Hadoop and MR rely on Java and the memory-intensive JVM. An MR framework based on High-Performance Computing (HPC) could be used, which is
Srinivasakumar Vignesh +3 more
doaj +1 more source
Evaluating MapReduce for seismic data processing using a practical application
Huge amounts of seismic data undergo complex iterative processing in the oil industry to reveal the earth’s subsurface structure and locate where oil can be found and recovered. To evaluate the suitability of MapReduce for seismic processing ...
Chang-hai ZHAO +4 more
doaj +2 more sources
A Distributed Approach for High-Dimensionality Heterogeneous Data Reduction
The recent explosion of data size in number of records and attributes has triggered the development of a number of Big Data analytics as well as parallel data processing methods and algorithms.
Rania Mkhinini Gahar +3 more
doaj +1 more source
Neural Network Models for Solar Irradiance Forecasting in Polluted Areas: A Comparative Study
A pollution‐aware hybrid ensemble model is proposed to forecast solar irradiance across eight diverse cities. The model integrates MLP, RNN, and NARX to handle varying atmospheric pollution levels. It outperforms state‐of‐the‐art methods, with enhanced accuracy and interpretability on a standard solar irradiance data set.
Mujtaba Ali +6 more
wiley +1 more source
Applying MapReduce to Conformance Checking
Process mining is a relatively new research field that offers methods for analyzing and improving business processes based on their execution history (event logs).
I. S. Shugurov, A. A. Mitsyuk
doaj +1 more source
Modern longitudinal data from wearable devices consist of biological signals at high‐frequency time points. Distributed statistical methods have emerged as a powerful tool to overcome the computational burden of estimation and inference with large data, but methodology for distributed functional regression remains limited.
Cole Manschot, Emily C. Hector
wiley +1 more source

