Performance Optimization System for Hadoop and Spark Frameworks
The optimization of large-scale data sets depends on the technologies and methods used. The MapReduce model, implemented on Apache Hadoop or Spark, allows splitting large data sets into a set of blocks distributed on several machines.
Astsatryan Hrachya +3 more
doaj +1 more source
A method for integrating GIS and big data platforms [PDF]
Geographic Information Systems (GIS) have played an important role in many applications of our daily life since 1970. Recently, with the rapid development of new technologies, the Earth's data has increased explosively. Many studies have been proposed to
Hong Le
doaj +1 more source
Controlling Network Latency in Mixed Hadoop Clusters: Do We Need Active Queue Management? [PDF]
With the advent of big data, data center applications are processing vast amounts of unstructured and semi-structured data, in parallel on large clusters, across hundreds to thousands of nodes.
Carpenter, Paul M. +1 more
core +1 more source
Hadoop Performance Analysis Model with Deep Data Locality [PDF]
Background: Hadoop has become the base framework for big data systems through the simple concept that moving computation is cheaper than moving data. Hadoop increases data locality in the Hadoop Distributed File System (HDFS) to improve performance ...
Jo, Ju-Yeon, Kim, Yoohwan, Lee, Sungchul
core +2 more sources
Database integrated analytics using R : initial experiences with SQL-Server + R [PDF]
Berral, Josep Ll., Poggi, Nicolas
core +2 more sources
Large Scale Citation Matching Using Apache Hadoop [PDF]
During the process of citation matching, links from bibliography entries to referenced publications are created. Such links are indicators of topical similarity between linked texts, are used in assessing the impact of the referenced document, and improve navigation in the user interfaces of digital libraries. In this paper we present a citation matching
Fedoryszak, Mateusz +2 more
openaire +2 more sources
A Three-Tier Authentication Scheme for Kerberized Hadoop Environment
Apache Hadoop answers the need to handle big data for most organizations. It offers distributed storage and data analysis via the Hadoop Distributed File System (HDFS) and MapReduce frameworks.
Hena M., Jeyanthi N.
doaj +1 more source
The article presents a detailed comparative analysis of the performance of a Microsoft SQL Server relational database and an Apache Hadoop environment in the context of analytical data processing.
Michał Zadrąg
doaj +1 more source
H-word: Supporting job scheduling in Hadoop with workload-driven data redistribution [PDF]
The final publication is available at http://link.springer.com/chapter/10.1007/978-3-319-44039-2_21. Today’s distributed data processing systems typically follow a query shipping approach and exploit data locality for reducing network traffic.
Abelló Gamazo, Alberto +3 more
core +1 more source
Developing a monitoring system for Cloud-based distributed data-centers [PDF]
Nowadays, more and more datacenters cooperate with each other to achieve a common and more complex goal. New advanced functionalities are required to support experts during recovery and management activities, such as anomaly detection and fault pattern recognition.
Elia Domenico +3 more
doaj +1 more source

