Finding Top-$k$ Dominance on Incomplete Big Data Using MapReduce Framework
Incomplete data is one major kind of multi-dimensional dataset that has randomly distributed missing nodes in its dimensions. It is very difficult to retrieve information from this type of dataset when it becomes large.
Payam Ezatpoor+3 more
doaj +1 more source
Energy Efficient Scheduling of MapReduce Jobs [PDF]
MapReduce has emerged as a prominent programming model for data-intensive computation. In this work, we study power-aware MapReduce scheduling in the speed scaling setting first introduced by Yao et al. [FOCS 1995]. We focus on the minimization of the total weighted completion time of a set of MapReduce jobs under a given budget of energy.
arxiv
MASSIVE SIMULATIONS USING MAPREDUCE MODEL
In the last few years, cloud computing has been growing as a dominant solution for large-scale numerical problems. It is based on the MapReduce programming model, which provides high scalability and flexibility, but also optimizes costs of computing infrastructure.
Artur Krupa, Bartosz Sawicki
doaj +1 more source
This invited paper introduces results on Web science and technology obtained during work with the Korea Advanced Institute of Science and Technology. In the first part, we discuss algorithms for exploring the deep Web, which refers to the collection of Web pages that cannot be reached by conventional Web crawlers. In the second part, we discuss sorting
openaire +4 more sources
Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads [PDF]
Within the past few years, organizations in diverse industries have adopted MapReduce-based systems for large-scale data processing. Along with these new users, important new workloads have emerged which feature many small, short, and increasingly interactive jobs in addition to the large, long-running batch jobs for which MapReduce was originally ...
arxiv
Fast Matrix Multiplication with Big Sparse Data
Big Data has become a buzzword nowadays due to the evolution of huge volumes of data beyond petabytes. This article focuses on matrix multiplication with big sparse data.
Somasekhar G., Karthikeyan K.
doaj +1 more source
MapReduce-Based D_ELT Framework to Address the Challenges of Geospatial Big Data
The conventional extracting-transforming-loading (ETL) system is typically operated on a single machine that is not capable of handling huge volumes of geospatial big data.
Junghee Jo, Kang-Woo Lee
doaj +1 more source
Evaluating MapReduce for Multi-core and Multiprocessor Systems [PDF]
Colby Ranger+4 more
openalex +1 more source
OS4M: Achieving Global Load Balance of MapReduce Workload by Scheduling at the Operation Level [PDF]
The efficiency of MapReduce is closely related to its load balance. Existing works on MapReduce load balance focus on coarse-grained scheduling. This study concerns fine-grained scheduling on MapReduce operations, with each operation representing one invocation of the Map or Reduce function.
arxiv