Apache hive - Open Access .click

Rumble: Data Independence for Large Messy Data Sets

2020
This paper introduces Rumble, an engine that executes JSONiq queries on large, heterogeneous and nested collections of JSON objects, leveraging the parallel capabilities of Spark so as to provide a high degree of data independence. The design is based on ...
Alonso, Gustavo +4 more
core +1 more source

A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection [PDF]

Journal of Advances in Computer Engineering and Technology, 2019
Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change.
Salah Uddin +4 more
doaj

Fraud Detection Framework for Blockchain Finance: Tackling Arbitrage, Liquidity Exploits, and Money Laundering

International Journal of Intelligent Systems, Volume 2026, Issue 1, 2026.
Blockchain technology has revolutionized numerous industries by providing decentralized, transparent, and immutable ledgers. However, its adoption is hindered by persistent security challenges, including arbitrage attacks, liquidity exploits, and noncompliance with anti-money laundering (AML) regulations.
Aleaddin Ozer, Murat Aydos, Richard Murray +2 more
wiley +1 more source

Primjena MapReduce algoritma na analizu nizova tekstualnih podataka (Application of the MapReduce Algorithm to the Analysis of Text Data Sequences) [PDF]

2014
This paper describes how Apache Hadoop and its components work and how they are applied. The most important component is MapReduce, which is seeing increasingly wide use. To use the MapReduce algorithm, one must understand how it works and learn some ...
Volarić, Karolina
core +2 more sources

ReStore: Reusing Results of MapReduce Jobs [PDF]

2012
Analyzing large scale data has emerged as an important activity for many organizations in the past few years. This large scale data analysis is facilitated by the MapReduce programming and execution model and its implementations, most notably Hadoop ...
Aboulnaga, Ashraf, Elghandour, Iman
core +4 more sources

A Systematic Overview of Caching Mechanisms to Improve Hadoop Performance

Concurrency and Computation: Practice and Experience, Volume 37, Issue 25-26, 30 November 2025.
In today's distributed computing environments, the rapid generation of large-scale data from diverse sources poses significant challenges in terms of storage, management, and processing, particularly for traditional relational databases. Hadoop has emerged as a widely adopted framework for handling such data through parallel processing across ...
Rana Ghazali, Douglas G. Down
wiley +1 more source

Bellwethers: A Baseline Method For Transfer Learning

2018
Software analytics builds quality prediction models for software projects. Experience shows that (a) the more projects studied, the more varied are the conclusions; and (b) project managers lose faith in the results of software analytics if those results ...
Krishna, Rahul, Menzies, Tim
core +1 more source

AsterixDB: A Scalable, Open Source BDMS [PDF]

2014
AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem.
Alsubaiee, Sattam +22 more
core +1 more source

BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning

2017
An ever increasing number of configuration parameters are provided to system users. But many users have used one configuration setting across different workloads, leaving untapped the performance potential of systems.
Bao, Yungang +7 more
core +1 more source

Apache Hive vs. Shark

2016
This paper discusses the evolution of Apache Hive and Shark. We look at the design modifications made to existing systems to achieve higher efficiency and better performance. The paper concludes with a brief discussion of the future of these technologies.
openaire +2 more sources

big data
hadoop
hive

data processing
apache pig