Integrating R and Hadoop for Big Data Analysis [PDF]
Analyzing and working with big data could be very difficult using classical means like relational database management systems or desktop software packages for statistics and visualization.
Dragoescu, Raluca Mariana +1 more
core +1 more source
A Systematic Overview of Caching Mechanisms to Improve Hadoop Performance
ABSTRACT In today's distributed computing environments, the rapid generation of large‐scale data from diverse sources poses significant challenges in terms of storage, management, and processing, particularly for traditional relational databases. Hadoop has emerged as a widely adopted framework for handling such data through parallel processing across ...
Rana Ghazali, Douglas G. Down
wiley +1 more source
Optimization of the Hadoop Platform for Distributed Computation [PDF]
This master's thesis deals with the possibilities of optimizing the Hadoop framework using the CUDA platform. Apache Hadoop is a framework that enables the analysis of huge volumes of data.
Čecho, Jaroslav
core
Text Mining with Apache Hadoop over different Hadoop Clusters Architectures
Big data is highly practical for real-time application systems. One of the most widely used real-time applications worldwide operates on unstructured documents. Large numbers of documents are managed and maintained through Hadoop, a popular, leading Big Data platform. It stores all the information in the Hadoop Distributed File System in blocks.
E. Laxmi Lydia +4 more
openaire +1 more source
Analysis of data processing efficiency with use of Apache Hive and Apache Pig in Hadoop environment
The aim of this paper is to analyze data processing efficiency using Apache Hive and Apache Pig in a Hadoop environment. The analysis was based on a comparison of the two tools using a large data set, represented by 28 million ...
Mikołaj Skrzypczyński, Piotr Muryjas
doaj +1 more source
Software analysis of scientific texts: comparative study of distributed computing frameworks
The relevance of this study is related to the need for efficient analysis of scientific texts in the context of the growing amount of information.
Serik Altynbek +3 more
doaj +1 more source
CloudDOE: a user-friendly tool for deploying Hadoop clouds and analyzing high-throughput sequencing data with MapReduce. [PDF]
Background: Explosive growth of next-generation sequencing data has resulted in ultra-large-scale data sets and ensuing computational problems. Cloud computing provides an on-demand and scalable environment for large-scale data analysis.
Wei-Chun Chung +9 more
doaj +1 more source
GiViP: A Visual Profiler for Distributed Graph Processing Systems
Analyzing large-scale graphs provides valuable insights in different application scenarios. While many graph processing systems working on top of distributed infrastructures have been proposed to deal with big graphs, the tasks of profiling and debugging
A Arleo +36 more
core +1 more source
ABSTRACT Using survey research, we investigate accountants' self‐rated knowledge of a variety of digital technologies (DTs). We find that accountants' self‐rated knowledge of established DTs is almost in line with IES2 requirements, but their self‐rated knowledge of emerging DTs is significantly below IES2 requirements. Of greater concern, we find that
Richard Busulwa +3 more
wiley +1 more source
ABSTRACT The Internet of Medical Things (IoMT) paradigm refers to the collection, transmission, and analysis of healthcare data using communication and information systems over the internet. IoMT consists of medical devices that can link to the internet or other networks, including wearables, sensors, monitoring tools, and other medical appliances.
Mert Melih Ozcelik +2 more
wiley +1 more source

