Apache Hadoop - Open Access .click


Integrating R and Hadoop for Big Data Analysis [PDF]

2014
Analyzing and working with big data could be very difficult using classical means like relational database management systems or desktop software packages for statistics and visualization.
Dragoescu, Raluca Mariana, Oancea, Bogdan +1 more
core +1 more source

A Systematic Overview of Caching Mechanisms to Improve Hadoop Performance

Concurrency and Computation: Practice and Experience, Volume 37, Issue 25-26, 30 November 2025.
ABSTRACT In today's distributed computing environments, the rapid generation of large‐scale data from diverse sources poses significant challenges in terms of storage, management, and processing, particularly for traditional relational databases. Hadoop has emerged as a widely adopted framework for handling such data through parallel processing across ...
Rana Ghazali, Douglas G. Down
wiley +1 more source

Optimization of the Hadoop Platform for Distributed Computation [PDF]

2012
This master's thesis examines options for optimizing the Hadoop framework using the CUDA platform. Apache Hadoop is a framework that enables the analysis of huge volumes of data.
Čecho, Jaroslav
core

Text Mining with Apache Hadoop over different Hadoop Clusters Architectures

International Journal of Recent Technology and Engineering (IJRTE), 2019
Big data is highly practical for real-time application systems. One of the most widely used real-time applications worldwide operates on unstructured documents. Large numbers of documents are managed and maintained through Hadoop, a popular leading big data platform. It stores all the information in the Hadoop Distributed File System (HDFS) in blocks.
E. Laxmi Lydia +4 more
openaire +1 more source

Analysis of data processing efficiency with use of Apache Hive and Apache Pig in Hadoop environment

Journal of Computer Sciences Institute
This paper aims to analyze data processing efficiency using Apache Hive and Apache Pig in a Hadoop environment. The analysis compares the two tools on a large data set, represented by 28 million ...
Mikołaj Skrzypczyński, Piotr Muryjas
doaj +1 more source

Software analysis of scientific texts: comparative study of distributed computing frameworks

Радіоелектронні і комп'ютерні системи
The relevance of this study is related to the need for efficient analysis of scientific texts in the context of the growing amount of information.
Serik Altynbek +3 more
doaj +1 more source

CloudDOE: a user-friendly tool for deploying Hadoop clouds and analyzing high-throughput sequencing data with MapReduce. [PDF]

PLoS ONE, 2014
Background: Explosive growth of next-generation sequencing data has resulted in ultra-large-scale data sets and ensuing computational problems. Cloud computing provides an on-demand and scalable environment for large-scale data analysis.
Wei-Chun Chung +9 more
doaj +1 more source

GiViP: A Visual Profiler for Distributed Graph Processing Systems

2017
Analyzing large-scale graphs provides valuable insights in different application scenarios. While many graph processing systems working on top of distributed infrastructures have been proposed to deal with big graphs, the tasks of profiling and debugging
A Arleo +36 more
core +1 more source

Current State of Accountants' Knowledge of Digital Technologies: Evidence From Australia and New Zealand

Accounting & Finance, Volume 65, Issue 3, Page 2649-2664, September 2025.
ABSTRACT Using survey research, we investigate accountants' self‐rated knowledge of a variety of digital technologies (DTs). We find that accountants' self‐rated knowledge of established DTs is almost in line with IES2 requirements, but their self‐rated knowledge of emerging DTs is significantly below IES2 requirements. Of greater concern, we find that
Richard Busulwa +3 more
wiley +1 more source

A Survey on Internet of Medical Things (IoMT): Enabling Technologies, Security and Explainability Issues, Challenges, and Future Directions

Expert Systems, Volume 42, Issue 5, May 2025.
ABSTRACT The Internet of Medical Things (IoMT) paradigm refers to the process of collection, transmission and analysis of healthcare data using communication and information systems over the internet. IoMT consists of medical devices that can link to the internet or other networks, including wearables, sensors, monitoring tools and other medical appliances.
Mert Melih Ozcelik, Ibrahim Kok, Suat Ozdemir +2 more
wiley +1 more source
