Results 211 to 220 of about 9,555 (241)
Some of the following articles may not be open access.
2016
In this chapter we consider situations in which a single host computer is inadequate because the data volume or processing demand exceeds the capacity of the host. A popular solution distributes the data and computations across a network of computers or a short-lived network created for the task (a cluster).
Brian Steele+2 more
openaire +2 more sources
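To make the idea of distributing data and computation across a cluster concrete, here is a minimal sketch (not taken from the chapter above) of the MapReduce pattern that frameworks such as Hadoop parallelize across many hosts; the mapper and reducer could run as Hadoop Streaming scripts, and a tiny local driver stands in for the shuffle so the example runs anywhere.

from collections import defaultdict

def mapper(line):
    # Emit (word, 1) pairs for each word in one line of input.
    for word in line.split():
        yield word.lower(), 1

def reducer(word, counts):
    # Sum the counts that the shuffle grouped under one key.
    return word, sum(counts)

def run_locally(lines):
    # Stand-in for the shuffle: group mapper output by key, then reduce.
    groups = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            groups[word].append(count)
    return dict(reducer(w, c) for w, c in groups.items())

if __name__ == "__main__":
    sample = ["big data needs big clusters", "data drives decisions"]
    print(run_locally(sample))  # {'big': 2, 'data': 2, ...}

On a real cluster the same mapper and reducer logic runs on many nodes at once, with the framework handling data placement, shuffling, and failure recovery.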
2013
Every day, quintillions of bytes of data are created. About 90% of this data, consisting of posts to social media sites, digital pictures, videos, and so on, is unstructured. This data is Big Data and must be formatted to make it suitable for data mining and subsequent analysis.
Suman Elizabeth Daniel, Binu A
openaire +1 more source
Hadoop-MCC: Efficient Multiple Compound Comparison Algorithm Using Hadoop
Combinatorial Chemistry & High Throughput Screening, 2018
Aim and Objective: In the past decade, drug design technologies have improved enormously. Computer-aided drug design (CADD) has played an important role in analysis and prediction in drug development, making the procedure more economical and efficient.
Guan Jie Hua+3 more
openaire +3 more sources
Communications of the ACM, 2013
The leading open source system for processing big data continues to evolve, but new approaches with added features are on the rise.
openaire +2 more sources
2014
Hadoop requires commodity cluster hardware to operate. One solution is to design a cluster, procure hardware, select a distribution, install Hadoop, and administer the cluster in-house. Some vendors deliver a completely configured cluster based on customer specifications, but the jobs of administration, maintenance, and upgrading remain. Installing and …
Sameer Wadkar, Madhu Siddalingaiah
openaire +2 more sources
Autoscaling for Hadoop Clusters
2016 IEEE International Conference on Cloud Engineering (IC2E), 2016
Unforeseen events such as node failures and resource contention can have a severe impact on the performance of data processing frameworks, such as Hadoop, especially in cloud environments where such incidents are common. SLA compliance in the presence of such events requires the ability to quickly and dynamically resize infrastructure resources ...
Andrzej Kochut+4 more
openaire +2 more sources
2014
Recently, I was talking with a friend about possibly using Hadoop to speed up reporting on his company’s “massive” data warehouse of 4TB. (He heads the IT department of one of the biggest real estate companies in the Chicago area.) Although he grudgingly agreed to a possible performance benefit, he asked very confidently, “But what about encrypting our …
openaire +2 more sources
2014
Analytics is the process of finding significance in data, meaning that it can support decision making. Decision makers turning to Hadoop’s data for their answers will find numerous analytics options. For example, Hadoop-based databases like Apache Hive and Cloudera Impala offer SQL-like interfaces with HDFS-based data.
openaire +2 more sources
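As a concrete illustration of the SQL-like access mentioned in the entry above, the sketch below uses the PyHive client library to run a HiveQL query against a HiveServer2 instance backed by HDFS data; the host name, table, and column names are placeholders for illustration, not values from the source.

from pyhive import hive

# Connect to HiveServer2, which exposes HDFS-backed tables through HiveQL.
conn = hive.Connection(host="hive-server.example.com", port=10000,
                       username="analyst", database="default")
cursor = conn.cursor()

# A SQL-like aggregation over data that physically lives in HDFS files.
cursor.execute("""
    SELECT region, COUNT(*) AS orders
    FROM sales_events
    GROUP BY region
    ORDER BY orders DESC
""")
for region, orders in cursor.fetchall():
    print(region, orders)

cursor.close()
conn.close()

Impala offers a similar experience through its own connectors, typically with lower query latency on the same HDFS-resident tables.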
Hadoop superlinear scalability
Communications of the ACM, 2015
The perpetual motion of parallel performance.
Kristofer Tomasette+2 more
openaire +2 more sources
2014
Because the potential storage capability of a Hadoop cluster is so very large, you need some means to track both the data contained on the cluster and the data feeds moving data into and out of it. In addition, you need to consider the locations where data might reside on the cluster—that is, in HDFS, Hive, HBase, or Impala.
openaire +2 more sources
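To make the bookkeeping described in the entry above concrete, here is a minimal sketch, assuming the WebHDFS-based `hdfs` Python package and a reachable NameNode; the endpoint URL and directory paths are illustrative placeholders only.

from hdfs import InsecureClient

# Directories where incoming feeds land and where Hive/HBase keep their data.
TRACKED_DIRS = ["/data/incoming", "/user/hive/warehouse", "/hbase/data"]

client = InsecureClient("http://namenode.example.com:9870", user="hdfs")

# Build a simple inventory: for each tracked location, list its entries and
# record size and modification time so feeds into and out of the cluster
# can be audited.
inventory = {}
for path in TRACKED_DIRS:
    entries = client.list(path, status=True)  # [(name, status_dict), ...]
    inventory[path] = [
        {"name": name,
         "bytes": status["length"],
         "modified_ms": status["modificationTime"]}
        for name, status in entries
    ]

for path, items in inventory.items():
    print(path, "->", len(items), "entries")

A fuller data-tracking setup would also record which Hive, HBase, or Impala tables map onto those HDFS locations, but the same listing-and-recording pattern applies.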