Results 151 to 160 of about 14,382 (202)
Some of the next articles are maybe not open access.

Related searches:

Using Apache Hadoop

2016
Apache Hadoop is the de facto framework for processing large data sets. Apache Hadoop is a distributed software application that runs across several (up to hundreds and thousands) of nodes across a cluster. Apache Hadoop comprises of two main components: Hadoop Distributed File System (HDFS) and MapReduce.
openaire   +3 more sources

Using Apache Hadoop Ecosystem

2016
Apache Hadoop has evolved to be the de facto framework for processing large quantities of data. Apache Hadoop ecosystem consists of a several projects including Apache Hive and Apache HBase. The Docker image "svds/cdh" is based on the latest CDH release and includes all the main frameworks in the Apache Hadoop ecosystem.
openaire   +3 more sources

Apache Hadoop YARN

Proceedings of the 4th annual Symposium on Cloud Computing, 2013
The initial design of Apache Hadoop [1] was tightly focused on running massive, MapReduce jobs to process a web crawl. For increasingly diverse companies, Hadoop has become the data and computational agora---the de facto place where data and computational resources are shared and accessed.
Vinod Kumar Vavilapalli   +15 more
openaire   +1 more source

Clustering with Apache Hadoop

Proceedings of the International Conference & Workshop on Emerging Trends in Technology - ICWET '11, 2011
The self-organizing map (SOM) is an unsupervised neural network which projects high-dimensional data onto a low-dimensional grid and visually reveals the topological order of the original data. Thus, SOM is an excellent tool in the exploratory phase of data mining.
S. Nair, J. Mehta
openaire   +1 more source

Performance comparison of Apache Hadoop and Apache Spark

Proceedings of the Third International Conference on Advanced Informatics for Computing Research, 2019
The term 'Big Data' is a broad term used for the data sets, which is enormous and traditional data processing applications find it hard to process. Both Apache Spark and Apache Hadoop are one of the significant parts of the big data family. Some of the researchers view both frameworks as the rivals but it is not that easy to compare these two as they ...
Amritpal Singh   +2 more
openaire   +1 more source

Apache hadoop goes realtime at Facebook

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, 2011
Facebook recently deployed Facebook Messages, its first ever user-facing application built on the Apache Hadoop platform. Apache HBase is a database-like layer built on Hadoop designed to support billions of messages per day. This paper describes the reasons why Facebook chose Hadoop and HBase over other systems such as Apache Cassandra and Voldemort ...
Dhruba Borthakur   +11 more
openaire   +1 more source

Home - About - Disclaimer - Privacy