A New Parallelization Method for K-means [PDF]

open access: yes, arXiv, 2016
K-means is a popular clustering method in data mining. To handle large datasets, researchers proposed PKMeans, a parallel k-means on MapReduce. However, existing k-means parallelization methods, including PKMeans, have many limitations.
arxiv  

Research on application of Hadoop in personnel positioning software system

open access: yes, Gong-kuang zidonghua, 2017
To address the problem that existing personnel locating systems cannot meet the large-scale data access requirements of large coal mines, Hadoop was proposed for use in a personnel positioning software system, and parallel computing model MapReduce ...
WANG Wei
doaj   +1 more source

MapReduce Meets Fine-Grained Complexity: MapReduce Algorithms for APSP, Matrix Multiplication, 3-SUM, and Beyond [PDF]

open access: yes, arXiv, 2019
Distributed processing frameworks, such as MapReduce, Hadoop, and Spark, are popular systems for processing large amounts of data. The design of efficient algorithms in these frameworks is a challenging problem, as the systems both require parallelism (since datasets are so large that multiple machines are necessary) and limit the degree of ...
arxiv  

Building Wavelet Histograms on Large Data in MapReduce [PDF]

open access: yes, Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 2, pp. 109-120 (2011), 2011
MapReduce is becoming the de facto framework for storing and processing massive data, due to its excellent scalability, reliability, and elasticity. In many MapReduce applications, obtaining a compact accurate summary of data is essential. Among various data summarization tools, histograms have proven to be particularly important and useful for ...
arxiv  

Cellular automata-based MapReduce design: Migrating a big data processing model from Industry 4.0 to Industry 5.0

open access: yes, e-Prime: Advances in Electrical Engineering, Electronics and Energy
A successful deployment of Industry 5.0 is significantly dependent on the synergetic integration of several advanced technologies such as big data processing, Artificial Intelligence (AI) integration, and several effective digitization techniques that ...
Arnab Mitra
doaj  

Energy management for MapReduce clusters [PDF]

open access: green, 2010
Willis Lang, Jignesh M. Patel
openalex   +1 more source

Testing MapReduce-Based Systems

open access: yes, 2011
MapReduce (MR) is the most popular solution to build applications for large-scale data processing. These applications are often deployed on large clusters of commodity machines, where failures happen constantly due to bugs, hardware problems, and outages.
Marynowski, João Eugenio   +3 more
openaire   +5 more sources

Column-oriented storage techniques for MapReduce [PDF]

open access: green, 2011
Avrilia Floratou   +3 more
openalex   +3 more sources

Distributed nonnegative matrix factorization for web-scale dyadic data analysis on mapreduce [PDF]

open access: green, 2010
Chao Liu   +4 more
openalex   +1 more source