Apache Mahout’s k-Means vs. fuzzy k-Means performance evaluation [PDF]
(c) 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or ...
Barolli, Leonard +3 more
core +1 more source
Apache Hadoop-MapReduce on YARN framework latency
Big Data is currently a fertile field for researchers and scientific companies around the world, due to the emergence of new technologies, the Internet of Things (IoT), and means of communication such as social networking sites, which has led to a notable increase in the amount of data produced each day.
Abdelaziz EL YAZIDI +3 more
openaire +1 more source
BigData Analysis in Healthcare: Apache Hadoop, Apache Spark and Apache Flink
Introduction: Health care data is increasing. The correct analysis of such data will improve the quality of care and reduce costs. This kind of data has certain features such as high volume, variety, high-speed production, etc. It makes it impossible to analyze with ordinary hardware and software platforms. Choosing the right platform for managing this
Elham Nazari +2 more
openaire +2 more sources
A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection [PDF]
Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change.
Salah Uddin +4 more
doaj
In this paper, file formats like Avro and Parquet are compared with text formats to evaluate the performance of the data queries. Different data query patterns have been evaluated. Cloudera’s open-source Apache Hadoop distribution CDH 5.4 has been chosen
Daiga Plase +2 more
doaj +1 more source
Advancing Organic Chemistry Using High‐Throughput Experimentation
This review outlines major advances in the design, execution, analysis, and data management phases of high‐throughput experimentation (HTE). The limitations and potential opportunities of applying modern HTE to organic synthesis are highlighted. Abstract High‐throughput experimentation (HTE), the miniaturization and parallelization of reactions, is a ...
Reem Nsouli +2 more
wiley +2 more sources
A Game-Theoretic Approach for Runtime Capacity Allocation in MapReduce [PDF]
Nowadays many companies have available large amounts of raw, unstructured data. Among Big Data enabling technologies, a central place is held by the MapReduce framework and, in particular, by its open source implementation, Apache Hadoop.
Ardagna, Danilo +3 more
core +3 more sources
Challenging SQL-on-Hadoop Performance with Apache Druid [PDF]
In Big Data, SQL-on-Hadoop tools usually provide satisfactory performance for processing vast amounts of data, although new emerging tools may be an alternative. This paper evaluates if Apache Druid, an innovative column-oriented data store suited for online analytical processing workloads, is an alternative to some of the well-known SQL-on-Hadoop ...
José Correia +2 more
openaire +2 more sources
Real-time Twitter data analysis using Hadoop ecosystem
In the era of the Internet, social media has become an integral part of modern society. People use social media to share their opinions and to have an up-to-date knowledge about the current trends on a daily basis.
Anisha P. Rodrigues +1 more
doaj +1 more source
Deploying Apache Spark virtual clusters in cloud environments using orchestration technologies
Apache Spark is a framework providing fast computations on Big Data using the MapReduce model. With cloud environments, Big Data processing becomes more flexible, since they allow virtual clusters to be created on demand. One of the most powerful open-source cloud
O. Borisenko +2 more
doaj +1 more source

