CloudDOE: a user-friendly tool for deploying Hadoop clouds and analyzing high-throughput sequencing data with MapReduce. [PDF]

open access: yes, PLoS ONE, 2014
Background: Explosive growth of next-generation sequencing data has resulted in ultra-large-scale data sets and ensuing computational problems. Cloud computing provides an on-demand and scalable environment for large-scale data analysis.
Wei-Chun Chung   +9 more
doaj   +1 more source

Optimization of Computing and Networking Resources of a Hadoop Cluster Based on Software Defined Network

open access: yes, IEEE Access, 2018
In this paper, we discuss some challenges regarding the Hadoop framework. One of the main ones is the computing performance of Hadoop MapReduce jobs in terms of CPU, memory, and hard disk I/O. The networking side of a Hadoop cluster is another challenge,
Ali Khaleel, Hamed Al-Raweshidy
doaj   +1 more source

Hadoop Technology

open access: yes, International Journal of Scientific Research in Science, Engineering and Technology, 2022
'Big Data' describes the technologies used to store, distribute, manage, and analyze large, high-velocity datasets. Big data can be structured, unstructured, or semi-structured, which makes conventional data management methods inadequate. Data is generated from many different sources and can arrive in the system at various rates. In order
Yash Patel, Prof. Manish Joshi
openaire   +1 more source

Attribute based honey encryption algorithm for securing big data: Hadoop distributed file system perspective [PDF]

open access: yes, PeerJ Computer Science, 2020
Hadoop has become a promising platform to reliably process and store big data. It provides flexible and low cost services to huge data through Hadoop Distributed File System (HDFS) storage.
Gayatri Kapil   +5 more
doaj   +2 more sources

A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures [PDF]

open access: yes, 2014
Scientific problems that depend on processing large amounts of data require overcoming challenges in multiple areas: managing large-scale data distribution, co-placement and scheduling of data with compute resources, and storing and transferring large ...
Fox, Geoffrey C.   +4 more
core   +1 more source

Enhanced Failure Detection Mechanism in MapReduce [PDF]

open access: yes, 2012
The popularity of the MapReduce programming model has increased the research community's interest in improving it. Among other directions, fault tolerance, and specifically failure detection, seems to be a crucial one, but that until
Antoniu, Gabriel   +2 more
core   +4 more sources

OEHadoop: Accelerate Hadoop Applications by Co-Designing Hadoop With Data Center Network

open access: yes, IEEE Access, 2018
Big data applications in Hadoop usually cause heavy bandwidth demand and network bottleneck in the current data center network (DCN). On one hand, the design of DCN does not take the traffic demand and the traffic patterns of Hadoop applications into ...
Yinan Tang   +7 more
doaj   +1 more source

A Parallel High-Utility Itemset Mining Algorithm Based on Hadoop

open access: yes, Complex System Modeling and Simulation, 2023
High-utility itemset mining (HUIM) can consider not only the profit factor but also the profitable factor, which is an essential task in data mining. However, most HUIM algorithms are mainly developed on a single machine, which is inefficient for big ...
Zaihe Cheng   +3 more
doaj   +1 more source

An Intrusive Analyzer for Hadoop Systems Based on Wireless Sensor Networks

open access: yes, International Journal of Distributed Sensor Networks, 2014
Owing to the acceleration of IoT- (Internet of Things-) based wireless sensor networks, cloud-computing services using Big Data are rapidly growing. In order to manage and analyze Big Data efficiently, Hadoop frameworks have been used in a variety of ...
Byoung-Jin Bae   +4 more
doaj   +1 more source

Spark deployment and performance evaluation on the MareNostrum supercomputer [PDF]

open access: yes, 2015
In this paper we present a framework to enable data-intensive Spark workloads on MareNostrum, a petascale supercomputer designed mainly for compute-intensive applications.
Ayguadé Parra, Eduard   +9 more
core   +1 more source
