CloudDOE: a user-friendly tool for deploying Hadoop clouds and analyzing high-throughput sequencing data with MapReduce. [PDF]
Background: Explosive growth of next-generation sequencing data has resulted in ultra-large-scale data sets and ensuing computational problems. Cloud computing provides an on-demand and scalable environment for large-scale data analysis.
Wei-Chun Chung +9 more
doaj +1 more source
In this paper, we discuss some challenges regarding the Hadoop framework. One of the main ones is the computing performance of Hadoop MapReduce jobs in terms of CPU, memory, and hard disk I/O. The networking side of a Hadoop cluster is another challenge,
Ali Khaleel, Hamed Al-Raweshidy
doaj +1 more source
'Big Data' describes technologies to store, distribute, manage, and analyze large, high-velocity datasets. Big data can be structured, unstructured, or semi-structured, which makes conventional data management methods inadequate. Data is generated from various sources and can arrive in the system at various rates. In order
Yash Patel, Prof. Manish Joshi
openaire +1 more source
Attribute based honey encryption algorithm for securing big data: Hadoop distributed file system perspective [PDF]
Hadoop has become a promising platform to reliably process and store big data. It provides flexible and low cost services to huge data through Hadoop Distributed File System (HDFS) storage.
Gayatri Kapil +5 more
doaj +2 more sources
A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures [PDF]
Scientific problems that depend on processing large amounts of data require overcoming challenges in multiple areas: managing large-scale data distribution, co-placement and scheduling of data with compute resources, and storing and transferring large ...
Fox, Geoffrey C. +4 more
core +1 more source
Enhanced Failure Detection Mechanism in MapReduce [PDF]
The popularity of the MapReduce programming model has increased interest in the research community in its improvement. Among other directions, fault tolerance, and concretely the failure detection issue, seems to be a crucial one, but that until
Antoniu, Gabriel +2 more
core +4 more sources
OEHadoop: Accelerate Hadoop Applications by Co-Designing Hadoop With Data Center Network
Big data applications in Hadoop usually cause heavy bandwidth demand and network bottleneck in the current data center network (DCN). On one hand, the design of DCN does not take the traffic demand and the traffic patterns of Hadoop applications into ...
Yinan Tang +7 more
doaj +1 more source
A Parallel High-Utility Itemset Mining Algorithm Based on Hadoop
High-utility itemset mining (HUIM) can consider not only the profit factor but also the profitable factor, which is an essential task in data mining. However, most HUIM algorithms are mainly developed on a single machine, which is inefficient for big ...
Zaihe Cheng +3 more
doaj +1 more source
An Intrusive Analyzer for Hadoop Systems Based on Wireless Sensor Networks
Owing to the acceleration of IoT- (Internet of Things-) based wireless sensor networks, cloud-computing services using Big Data are rapidly growing. In order to manage and analyze Big Data efficiently, Hadoop frameworks have been used in a variety of ...
Byoung-Jin Bae +4 more
doaj +1 more source
Spark deployment and performance evaluation on the MareNostrum supercomputer [PDF]
In this paper we present a framework to enable data-intensive Spark workloads on MareNostrum, a petascale supercomputer designed mainly for compute-intensive applications.
Ayguadé Parra, Eduard +9 more
core +1 more source

