
BSP vs MapReduce [PDF]

open access: yes | Procedia Computer Science, 2012
The MapReduce framework has been generating a lot of interest in a wide range of areas. It has been widely adopted in industry and has been used to solve a number of non-trivial problems in academia. Putting MapReduce on strong theoretical foundations is crucial in understanding its capabilities.
arxiv   +4 more sources

Coded MapReduce [PDF]

open access: yes | arXiv, 2015
MapReduce is a commonly used framework for executing data-intensive jobs on distributed server clusters. We introduce a variant implementation of MapReduce, namely "Coded MapReduce", to substantially reduce the inter-server communication load for the shuffling phase of MapReduce, thereby accelerating its execution.
Li, Songze   +2 more
arxiv   +3 more sources

Tiled-MapReduce [PDF]

open access: bronze | ACM Transactions on Architecture and Code Optimization, 2013
The prevalence of chip multiprocessors opens opportunities for running data-parallel applications, originally run in clusters, on a single machine with many cores. MapReduce, a simple and elegant programming model for large-scale clusters, has recently been shown to be a promising alternative for harnessing the multicore platform.
Rong Chen, Haibo Chen
openalex   +3 more sources

The Efficiency of MapReduce in Parallel External Memory [PDF]

open access: green | arXiv, 2011
Since its introduction in 2004, the MapReduce framework has become one of the standard approaches in massive distributed and parallel computation. In contrast to its intensive use in practice, its theoretical footing is still limited, and little work has yet been done to put MapReduce on a par with the major computational models. Following pioneer work
Gero Greiner, Riko Jacob
arxiv   +3 more sources

Meta-MapReduce: A Technique for Reducing Communication in MapReduce Computations [PDF]

open access: yes | arXiv, 2015
MapReduce has proven to be one of the most useful paradigms in the revolution of distributed computing, where cloud services and cluster computing become the standard venue for computing. The federation of cloud and big data activities is the next challenge where MapReduce should be modified to avoid (big) data migration across remote (cloud) sites ...
Afrati, Foto   +3 more
arxiv   +3 more sources

Automatic Optimization for MapReduce Programs [PDF]

open access: yes | Proceedings of the VLDB Endowment (PVLDB), Vol. 4, No. 6, pp. 385-396, 2011
The MapReduce distributed programming framework has become popular, despite evidence that current implementations are inefficient, requiring far more hardware than traditional relational databases to complete similar tasks. MapReduce jobs are amenable to many traditional database query optimizations (B+Trees for selections, column-store-style ...
Michael Cafarella   +2 more
arxiv   +5 more sources

Parallelization of Maximum Entropy POS Tagging for Bahasa Indonesia with MapReduce [PDF]

open access: green | arXiv, 2012
In this paper, the MapReduce programming model is used to parallelize the training and tagging processes in Maximum Entropy part-of-speech tagging for Bahasa Indonesia. In the training process, MapReduce is used to implement dictionary, tagtoken, and feature creation. In the tagging process, MapReduce is used to tag the lines of a document in parallel.
Arif Nurwidyantoro, Edi Winarko
arxiv   +3 more sources

Big Data Technology Fusion Back Propagation Neural Network in Product Innovation Design Method

open access: yes | IET Networks, Accepted Article, 2022
This research applies big data technology to the product innovation design process, which has certain significance for the formation of an intelligent and systematic product innovation design method. In addition, while predicting the results of all product innovation design methods, it can improve the product's predictive innovative design ...
Ren Li, Qiang Zeng
wiley   +1 more source

A distributed data processing scheme based on Hadoop for synchrotron radiation experiments. [PDF]

open access: yes | J Synchrotron Radiat
A set of distributed data processing schemes using Hadoop is presented for beamlines with experimental data. With the development of synchrotron radiation sources and high-frame-rate detectors, the amount of experimental data collected at synchrotron radiation beamlines has increased exponentially. As a result, data processing for synchrotron radiation
Zhang D   +6 more
europepmc   +2 more sources

Behavioral simulations in MapReduce [PDF]

open access: yes | Proceedings of the VLDB Endowment, 2010
In many scientific domains, researchers are turning to large-scale behavioral simulations to better understand real-world phenomena. While there has been a great deal of work on simulation tools from the high-performance computing community, behavioral simulations remain challenging to program and automatically scale in parallel environments.
Johannes Gehrke   +7 more
openaire   +3 more sources
