Results 211 to 220 of about 9,555 (241)
Some of the following articles may not be open access.

Hadoop and MapReduce

2016
In this chapter we consider situations in which a single host computer is inadequate because the data volume or processing demand exceeds the capacity of the host. A popular solution distributes the data and computations across a network of computers or a short-lived network created for the task (a cluster).
Brian Steele   +2 more
openaire   +2 more sources
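As a rough illustration of the MapReduce model the chapter abstract refers to (not taken from the chapter itself), the classic word-count job can be written as two small scripts in the Hadoop Streaming style: Hadoop pipes text through the mapper, sorts the mapper output by key, and then pipes the grouped pairs through the reducer.

```python
#!/usr/bin/env python3
# mapper.py -- minimal Hadoop Streaming mapper sketch: emit "word<TAB>1" for each word.
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print(f"{word}\t1")
```

```python
#!/usr/bin/env python3
# reducer.py -- minimal Hadoop Streaming reducer sketch: input arrives sorted by key,
# so counts for the same word are adjacent and can be summed in one pass.
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").split("\t", 1)
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print(f"{current_word}\t{current_count}")
        current_word, current_count = word, int(count)

if current_word is not None:
    print(f"{current_word}\t{current_count}")
```

Such scripts are submitted with the hadoop-streaming jar that ships with a Hadoop distribution; the exact jar path and options vary by version.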

A Quest on Hadoop

2013
Every day, quintillions of bytes of data are created. About 90% of this data, consisting of posts to social media sites, digital pictures, videos, and so on, is unstructured. This data is Big Data and must be formatted to make it suitable for data mining and subsequent analysis.
Suman Elizabeth Daniel, Binu A
openaire   +1 more source

Hadoop-MCC: Efficient Multiple Compound Comparison Algorithm Using Hadoop

Combinatorial Chemistry & High Throughput Screening, 2018
Aim and Objective: In the past decade, drug design technologies have improved enormously. Computer-aided drug design (CADD) has played an important role in analysis and prediction in drug development, making the procedure more economical and efficient.
Guan Jie Hua   +3 more
openaire   +3 more sources

Beyond Hadoop

Communications of the ACM, 2013
The leading open source system for processing big data continues to evolve, but new approaches with added features are on the rise.
openaire   +2 more sources

Hadoop in the Cloud

2014
Hadoop requires commodity cluster hardware to operate. One solution is to design a cluster, procure hardware, select a distribution, install Hadoop, and administer the cluster in-house. Some vendors deliver a completely configured cluster based on customer specifications, but the jobs of administration, maintenance, and upgrading remain. Installing and ...
Sameer Wadkar, Madhu Siddalingaiah
openaire   +2 more sources

Autoscaling for Hadoop Clusters

2016 IEEE International Conference on Cloud Engineering (IC2E), 2016
Unforeseen events such as node failures and resource contention can have a severe impact on the performance of data processing frameworks, such as Hadoop, especially in cloud environments where such incidents are common. SLA compliance in the presence of such events requires the ability to quickly and dynamically resize infrastructure resources ...
Andrzej Kochut   +4 more
openaire   +2 more sources
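The abstract above concerns quickly resizing a cluster to stay within SLAs when nodes fail or resources are contended. The sketch below is not the paper's algorithm, only a deliberately naive threshold-based decision loop of the kind an autoscaler runs; cluster_metrics, add_nodes, and remove_nodes are hypothetical stand-ins for a monitoring and provisioning API.

```python
# Hypothetical threshold-based autoscaling policy sketch (not the paper's method).
import time

SCALE_UP_UTIL = 0.85    # add a node when sustained utilization exceeds this
SCALE_DOWN_UTIL = 0.30  # remove a node when utilization stays below this
POLL_SECONDS = 60

def autoscale_loop(cluster_metrics, add_nodes, remove_nodes, min_nodes=3, max_nodes=50):
    nodes = min_nodes
    while True:
        util, failed = cluster_metrics()   # e.g. average utilization, count of failed nodes
        if failed > 0:
            add_nodes(failed)              # replace failed nodes first to protect the SLA
            nodes += failed
        elif util > SCALE_UP_UTIL and nodes < max_nodes:
            add_nodes(1)
            nodes += 1
        elif util < SCALE_DOWN_UTIL and nodes > min_nodes:
            remove_nodes(1)
            nodes -= 1
        time.sleep(POLL_SECONDS)
```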

Encryption in Hadoop

2014
Recently, I was talking with a friend about possibly using Hadoop to speed up reporting on his company’s “massive” data warehouse of 4TB. (He heads the IT department of one of the biggest real estate companies in the Chicago area.) Although he grudgingly agreed to a possible performance benefit, he asked very confidently, “But what about encrypting our ...
openaire   +2 more sources

Analytics with Hadoop

2014
Analytics is the process of finding significance in data so that it can support decision making. Decision makers turning to Hadoop’s data for their answers will find numerous analytics options. For example, Hadoop-based databases like Apache Hive and Cloudera Impala offer SQL-like interfaces to HDFS-based data.
openaire   +2 more sources
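To make the “SQL-like interfaces” mentioned above concrete, here is a small sketch that runs a HiveQL aggregate over an HDFS-backed table using the third-party PyHive client; the host, username, and page_views table are assumptions, not details from the chapter.

```python
# Query a Hive table over HDFS via HiveServer2 using PyHive (pip install pyhive).
from pyhive import hive

conn = hive.Connection(host="hive-server.example.com", port=10000, username="analyst")
cursor = conn.cursor()
# HiveQL aggregate over a hypothetical HDFS-backed "page_views" table.
cursor.execute(
    "SELECT country, COUNT(*) AS views "
    "FROM page_views GROUP BY country ORDER BY views DESC LIMIT 10"
)
for country, views in cursor.fetchall():
    print(country, views)
cursor.close()
conn.close()
```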

Hadoop superlinear scalability

Communications of the ACM, 2015
The perpetual motion of parallel performance.
Kristofer Tomasette   +2 more
openaire   +2 more sources
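As a hedged aside that is not quoted from the article: superlinear scalability means the speedup on N nodes exceeds N. A common way to model measured speedup curves is Gunther's Universal Scalability Law,

$$ S(N) = \frac{N}{1 + \alpha\,(N-1) + \beta\,N\,(N-1)} , $$

where $\alpha$ models contention and $\beta$ models coherency delay; with $\alpha, \beta \ge 0$ the model gives $S(N) \le N$, so a superlinear fit requires a negative contention coefficient over the measured range.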

Reporting with Hadoop

2014
Because the potential storage capacity of a Hadoop cluster is so large, you need some means of tracking both the data contained on the cluster and the data feeds moving data into and out of it. In addition, you need to consider the locations where data might reside on the cluster—that is, in HDFS, Hive, HBase, or Impala.
openaire   +2 more sources
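One concrete way to track what data sits in HDFS, as the abstract above recommends, is the WebHDFS REST API's LISTSTATUS operation. In this sketch the namenode host, the port (9870 on Hadoop 3.x, 50070 on 2.x), and the /data path are assumptions.

```python
# List an HDFS directory through the WebHDFS REST API.
import requests

def list_hdfs_dir(namenode="namenode.example.com", port=9870, path="/data"):
    url = f"http://{namenode}:{port}/webhdfs/v1{path}?op=LISTSTATUS"
    statuses = requests.get(url, timeout=10).json()["FileStatuses"]["FileStatus"]
    for s in statuses:
        # pathSuffix is the entry name; type is "FILE" or "DIRECTORY"; length is in bytes.
        print(s["type"], s["length"], s["pathSuffix"])

if __name__ == "__main__":
    list_hdfs_dir()
```

Hive, HBase, and Impala keep their own catalogs (for example, the Hive metastore), which is why data has to be tracked separately in each of those locations.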
