Rumble: Data Independence for Large Messy Data Sets
This paper introduces Rumble, an engine that executes JSONiq queries on large, heterogeneous and nested collections of JSON objects, leveraging the parallel capabilities of Spark so as to provide a high degree of data independence. The design is based on
Alonso, Gustavo +4 more
core +1 more source
A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection [PDF]
Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change.
Salah Uddin +4 more
doaj
Blockchain technology has revolutionized numerous industries by providing decentralized, transparent, and immutable ledgers. However, its adoption is hindered by persistent security challenges, including arbitrage attacks, liquidity exploits, and noncompliance with anti-money laundering (AML) regulations.
Aleaddin Ozer +2 more
wiley +1 more source
Application of the MapReduce Algorithm to the Analysis of Text Data Strings [PDF]
This paper describes how Apache Hadoop and its components work and how they are applied. The most important component is MapReduce, which is seeing ever wider use. To use the MapReduce algorithm, one needs to understand how it works and learn some ...
Volarić, Karolina
core +2 more sources
ReStore: Reusing Results of MapReduce Jobs [PDF]
Analyzing large scale data has emerged as an important activity for many organizations in the past few years. This large scale data analysis is facilitated by the MapReduce programming and execution model and its implementations, most notably Hadoop ...
Aboulnaga, Ashraf, Elghandour, Iman
core +4 more sources
A Systematic Overview of Caching Mechanisms to Improve Hadoop Performance
In today's distributed computing environments, the rapid generation of large‐scale data from diverse sources poses significant challenges in terms of storage, management, and processing, particularly for traditional relational databases. Hadoop has emerged as a widely adopted framework for handling such data through parallel processing across ...
Rana Ghazali, Douglas G. Down
wiley +1 more source
Bellwethers: A Baseline Method For Transfer Learning
Software analytics builds quality prediction models for software projects. Experience shows that (a) the more projects studied, the more varied are the conclusions; and (b) project managers lose faith in the results of software analytics if those results
Krishna, Rahul, Menzies, Tim
core +1 more source
AsterixDB: A Scalable, Open Source BDMS [PDF]
AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem.
Alsubaiee, Sattam +22 more
core +1 more source
BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning
An ever-increasing number of configuration parameters are provided to system users. But many users have used one configuration setting across different workloads, leaving untapped the performance potential of systems.
Bao, Yungang +7 more
core +1 more source
This paper discusses the evolution of Apache Hive and Shark. We look at the design modifications made to existing systems to achieve higher efficiency and better performance. The paper concludes with a brief discussion of the future of these technologies.
openaire +2 more sources

