
Big data cleaning modeling of operation status of coal mine fully-mechanized coal mining equipment

open access: yes
Gong-kuang zidonghua, 2018
To address the large data volume, noise, and missing values in operation-status data from coal mine fully-mechanized coal mining equipment, a big data cleaning model of operation status of coal mine fully-mechanized coal mining ...
MA Hongwei   +4 more
doaj   +1 more source

Parallel Computation of Rough Set Approximations in Information Systems with Missing Decision Data

open access: yes
Computers, 2018
The paper discusses the use of parallel computation to obtain rough set approximations from large-scale information systems where missing data exist in both condition and decision attributes.
Thinh Cao   +4 more
doaj   +1 more source

Computing marginals using MapReduce [PDF]

open access: yes
Journal of Computer and System Sciences, 2018
We consider the problem of computing the data-cube marginals of a fixed order $k$ (i.e., all marginals that aggregate over $k$ dimensions), using a single round of MapReduce. The focus is on the relationship between the reducer size (number of inputs allowed at a single reducer) and the replication rate (number of reducers to which an input is sent ...
Afrati, Foto N.   +3 more
openaire   +3 more sources

Spatial hotspot detection using polygon propagation

open access: yes
International Journal of Digital Earth, 2019
Spatial scan statistics is one of the most important models for detecting high-activity regions, or hotspots, in real-world applications on geographic data, such as epidemiology, public health, astronomy, and criminology. Traditional scan statistic
Satya Katragadda   +2 more
doaj   +1 more source

An alternative C++-based HPC system for Hadoop MapReduce

open access: yes
Open Computer Science, 2022
MapReduce (MR) is a technique used to improve distributed data processing vastly and can massively speed up computation. Hadoop and MR rely on memory-intensive JVM and Java. A MR framework based on High-Performance Computing (HPC) could be used, which is
Srinivasakumar Vignesh   +3 more
doaj   +1 more source

Evaluating MapReduce for seismic data processing using a practical application

open access: yes
Tongxin xuebao, 2012
Huge amounts of seismic data undergo complex iterative processing in the oil industry to get knowledge of the earth’s subsurface structure to detect where oil can be found and recovered. To evaluate the suitability of MapReduce for seismic processing ...
Chang-hai ZHAO   +4 more
doaj   +2 more sources

A Distributed Approach for High-Dimensionality Heterogeneous Data Reduction

open access: yes
IEEE Access, 2019
The recent explosion of data size in number of records and attributes has triggered the development of a number of Big Data analytics as well as parallel data processing methods and algorithms.
Rania Mkhinini Gahar   +3 more
doaj   +1 more source

Neural Network Models for Solar Irradiance Forecasting in Polluted Areas: A Comparative Study

open access: yes
Energy Science & Engineering, EarlyView.
A pollution‐aware hybrid ensemble model is proposed to forecast solar irradiance across eight diverse cities. The model integrates MLP, RNN, and NARX to handle varying atmospheric pollution levels. The model outperforms state‐of‐the‐art methods with enhanced accuracy and interpretability on a standard solar irradiance data set.
Mujtaba Ali   +6 more
wiley   +1 more source

Applying MapReduce to Conformance Checking

open access: yes
Труды Института системного программирования РАН, 2018
Process mining is a relatively new research field, offering methods of business processes analysis and improvement, which are based on studying their execution history (event logs).
I. S. Shugurov, A. A. Mitsyuk
doaj   +1 more source

Functional regression with intensively measured longitudinal outcomes: a new lens through data partitioning

open access: yes
Canadian Journal of Statistics, Volume 53, Issue 4, December 2025.
Modern longitudinal data from wearable devices consist of biological signals at high‐frequency time points. Distributed statistical methods have emerged as a powerful tool to overcome the computational burden of estimation and inference with large data, but methodology for distributed functional regression remains limited.
Cole Manschot, Emily C. Hector
wiley   +1 more source
