A New Parallelization Method for K-means [PDF]
K-means is a popular clustering method used in data mining. To work with large datasets, researchers propose PKMeans, a parallel k-means on MapReduce. However, the existing k-means parallelization methods, including PKMeans, have many limitations.
arxiv
Research on application of Hadoop in personnel positioning software system
To solve the problem that existing personnel locating systems cannot meet the large data access requirements of large-scale coal mines, Hadoop was proposed for use in a personnel positioning software system, and parallel computing model MapReduce ...
WANG Wei
doaj +1 more source
MapReduce Meets Fine-Grained Complexity: MapReduce Algorithms for APSP, Matrix Multiplication, 3-SUM, and Beyond [PDF]
Distributed processing frameworks, such as MapReduce, Hadoop, and Spark, are popular systems for processing large amounts of data. The design of efficient algorithms in these frameworks is a challenging problem, as the systems both require parallelism (since datasets are so large that multiple machines are necessary) and limit the degree of ...
arxiv
CloudBurst: highly sensitive read mapping with MapReduce [PDF]
Michael C. Schatz
openalex +1 more source
Building Wavelet Histograms on Large Data in MapReduce [PDF]
MapReduce is becoming the de facto framework for storing and processing massive data, due to its excellent scalability, reliability, and elasticity. In many MapReduce applications, obtaining a compact accurate summary of data is essential. Among various data summarization tools, histograms have proven to be particularly important and useful for ...
arxiv
A successful deployment of Industry 5.0 is significantly dependent on the synergetic integration of several advanced technologies such as big data processing, Artificial Intelligence (AI) integration, and several effective digitization techniques that ...
Arnab Mitra
doaj
Energy management for MapReduce clusters [PDF]
Willis Lang, Jignesh M. Patel
openalex +1 more source
Testing MapReduce-Based Systems
MapReduce (MR) is the most popular solution to build applications for large-scale data processing. These applications are often deployed on large clusters of commodity machines, where failures happen constantly due to bugs, hardware problems, and outages.
Marynowski, João Eugenio +3 more
openaire +5 more sources
Column-oriented storage techniques for MapReduce [PDF]
Avrilia Floratou +3 more
openalex +3 more sources
Distributed nonnegative matrix factorization for web-scale dyadic data analysis on mapreduce [PDF]
Chao Liu +4 more
openalex +1 more source