Results 201 to 210 of about 41,447
Some of the following articles may not be open access.
International Journal of Cloud Applications and Computing, 2011
As a main subfield of cloud computing applications, internet services require large-scale data computing. Their workloads can be divided into two classes: customer-facing interactive query-processing tasks, which serve hundreds of millions of users within a short response time, and backend batch data-analysis tasks, which involve petabytes of data.
Zhiwei Xu, Bo Yan, Yongqiang Zou
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, 2013
From its beginnings as a framework for building web crawlers for small-scale search engines to being one of the most promising technologies for building datacenter-scale distributed computing and storage platforms, Apache Hadoop has come far in the last seven years.
Proceedings of the International Conference on Research in Adaptive and Convergent Systems, 2016
Hadoop is a widely used software framework for handling massive data. As heterogeneous computing gains momentum, variants of Hadoop have been developed to offload the computation of Hadoop applications onto heterogeneous processors such as GPUs, DSPs, and FPGAs. Unfortunately, these variants do not support on-demand resource scaling for the ...
Yi-Wei Chen +3 more
Proceedings of the 8th ACM International Conference on Computing Frontiers, 2011
The information-technology platform is being radically transformed with the widespread adoption of the cloud computing model supported by data centers containing large numbers of multicore servers. While cloud computing platforms can potentially enable a rich variety of distributed applications, the need to exploit multiscale parallelism at the inter ...
Riyaz Haque +2 more
Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science, 2009
MapReduce provides a parallel and scalable programming model for data-intensive business and scientific applications. MapReduce and its de facto open-source implementation, Hadoop, support parallel processing on large datasets, with capabilities including automatic data partitioning and distribution, load balancing, and fault tolerance management ...
Jianwu Wang +2 more
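The capabilities this abstract lists are easiest to see in the canonical Hadoop word-count job. The sketch below uses the standard org.apache.hadoop.mapreduce API and assumes Hadoop 2.x or later on the classpath; note that the framework, not the user code, handles the partitioning, distribution, load balancing, and fault tolerance the authors mention.

```java
// Minimal Hadoop MapReduce word-count sketch; assumes Hadoop 2.x+ on the classpath.
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
  // Map: emit (word, 1) for each token; the framework partitions and sorts pairs.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();
    @Override
    protected void map(Object key, Text value, Context ctx)
        throws IOException, InterruptedException {
      StringTokenizer it = new StringTokenizer(value.toString());
      while (it.hasMoreTokens()) {
        word.set(it.nextToken());
        ctx.write(word, ONE);
      }
    }
  }

  // Reduce: sum the counts for each word. Data distribution, load balancing,
  // and task re-execution on failure are managed by the framework.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) sum += v.get();
      result.set(sum);
      ctx.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```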
Proceedings of the VLDB Endowment, 2013
We analyze Hadoop workloads from three different research clusters from a user-centric perspective. The goal is to better understand data scientists' use of the system and how well the use of the system matches its design. Our analysis suggests that Hadoop usage is still in its adolescence. We see underuse of Hadoop features, extensions, and tools.
Kai Ren +3 more
Proceedings of the 2017 ACM International Conference on Management of Data, 2017
This paper presents ST-Hadoop, the first full-fledged open-source MapReduce framework with native support for spatio-temporal data. ST-Hadoop is a comprehensive extension to Hadoop and SpatialHadoop that injects spatio-temporal data awareness inside each of their layers, mainly the language, indexing, and operations layers.
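ST-Hadoop's actual indexing layer is defined in the paper itself; purely as an illustration of what "spatio-temporal awareness" at the indexing layer can mean, the sketch below shows a hypothetical composite key that buckets records by a time slice and a spatial grid cell, so records close in time and space sort into the same partition. The class name and bucketing scheme are assumptions for illustration, not ST-Hadoop's API.

```java
// Hypothetical spatio-temporal composite key, NOT ST-Hadoop's actual API:
// bucket records by (time slice, spatial grid cell) so that records close in
// time and space sort together and land in the same index partition.
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public final class SpatioTemporalKey implements Comparable<SpatioTemporalKey> {
  private final long daySlice; // temporal bucket: whole days since the epoch
  private final int cellX;     // spatial bucket: fixed-resolution lon/lat grid cell
  private final int cellY;

  // Assumes lon in [-180, 180] and lat in [-90, 90].
  public SpatioTemporalKey(Instant t, double lon, double lat, int cellsPerAxis) {
    this.daySlice = ChronoUnit.DAYS.between(Instant.EPOCH, t);
    this.cellX = (int) Math.min(cellsPerAxis - 1, (lon + 180.0) / 360.0 * cellsPerAxis);
    this.cellY = (int) Math.min(cellsPerAxis - 1, (lat + 90.0) / 180.0 * cellsPerAxis);
  }

  // Sort by time first, then by spatial cell: a temporal slicing on top of a
  // spatial index, in the spirit of the layering the abstract describes.
  @Override
  public int compareTo(SpatioTemporalKey o) {
    if (daySlice != o.daySlice) return Long.compare(daySlice, o.daySlice);
    if (cellX != o.cellX) return Integer.compare(cellX, o.cellX);
    return Integer.compare(cellY, o.cellY);
  }
}
```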
Communications of the ACM, 2013
The leading open source system for processing big data continues to evolve, but new approaches with added features are on the rise.
Proceedings of the 2013 companion publication for conference on Systems, programming, & applications: software for humanity, 2013
We introduce HJ-Hadoop (Habanero Java Hadoop), an extension to the Hadoop MapReduce system that is optimized for multi-core machines. HJ-Hadoop exploits intra-JVM parallelism, which increases the memory efficiency of each node. Results show a significant improvement in the amount of data each MapReduce job could process and load balance across cores for certain ...
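HJ-Hadoop's own runtime is built on Habanero Java tasks, which the paper details. As a rough analogue in stock Hadoop, the bundled MultithreadedMapper also exploits intra-JVM parallelism by running several map threads inside one JVM, so large in-memory structures can be loaded once per node rather than duplicated per map process. A minimal configuration sketch, reusing the TokenizerMapper from the word-count sketch above:

```java
// Not HJ-Hadoop itself: a rough analogue using stock Hadoop's MultithreadedMapper,
// which runs several map threads inside a single JVM so per-node memory is shared.
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;

public class IntraJvmParallelism {
  public static void configure(Job job) {
    // Wrap the real mapper in MultithreadedMapper and run 8 map threads per JVM.
    job.setMapperClass(MultithreadedMapper.class);
    MultithreadedMapper.setMapperClass(job, WordCount.TokenizerMapper.class);
    MultithreadedMapper.setNumberOfThreads(job, 8);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
  }
}
```

This only helps when the mapper is thread-safe and CPU-bound; HJ-Hadoop's contribution is a runtime that manages such intra-JVM parallelism automatically rather than per-job.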

