Apache spark - Open Access .click

TRANSMUT‐Spark: Transformation mutation for Apache Spark [PDF]

Software Testing, Verification and Reliability, 2022
Summary: This paper proposes TRANSMUT‐Spark for automating mutation testing of big data processing code within Spark programs. Apache Spark is an engine for big data analytics/processing that hides the inherent complexity of parallel big data programming. Nonetheless, programmers must cleverly combine Spark built‐in functions within programs and guide the ...
João Batista de Souza Neto +3 more
openaire +4 more sources

Efficient processing of complex XSD using Hive and Spark [PDF]

PeerJ Computer Science, 2021
eXtensible Markup Language (XML) files are widely used in industry because of their flexibility in representing many kinds of data. Applications such as financial records, social networks, and mobile networks use complex XML schemas with
Diana Martinez-Mosquera, Rosa Navarrete, Sergio Luján-Mora +2 more
doaj +2 more sources

Efficient Group K Nearest-Neighbor Spatial Query Processing in Apache Spark

ISPRS International Journal of Geo-Information, 2021
For spatial query processing in distributed computing systems, designing and implementing new distributed spatial query algorithms remains a current challenge.
Panagiotis Moutafis +3 more
doaj +1 more source

Comparative Study of Record Linkage Approaches for Big Data

Walailak Journal of Science and Technology, 2021
Record linkage is a challenging task for Big Data. This paper, hence, attempts to shed light on record linkage approaches for Big Data by comparing three dimensions involving record linkage phases, dataset properties, and parallel processing approach ...
Randa MOHAMED +3 more
doaj +3 more sources

Apache Spark usage and deployment models for scientific computing [PDF]

EPJ Web of Conferences, 2019
This talk shares our recent experience in providing a data analytics platform based on Apache Spark for High Energy Physics, the CERN accelerator logging system, and infrastructure monitoring.
Castro Diogo +4 more
doaj +1 more source

Machine Learning-Assisted Diabetes Prediction with Apache Spark

Düzce Üniversitesi Bilim ve Teknoloji Dergisi, 2022
Diabetes is one of the critical health problems affecting the organs of the human body. For this reason, diabetes is regarded as a global health problem of the 21st century.
Emre Yıldırım, Ali Çalhan
doaj +1 more source

Change Detection of Mobile LiDAR Data Using Cloud Computing [PDF]

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2016
Change detection has long been a challenging problem although a lot of research has been conducted in different fields such as remote sensing and photogrammetry, computer vision, and robotics. In this paper, we blend voxel grid and Apache Spark together
K. Liu, J. Boehm, C. Alis
doaj +1 more source

Performance Evaluation of Query Plan Recommendation with Apache Hadoop and Apache Spark

Mathematics, 2022
Access plan recommendation is a query optimization approach that executes new queries using previously created query execution plans (QEPs). In this approach, the query optimizer divides the query space into clusters.
Elham Azhir +3 more
doaj +1 more source

Performance Study of an Apache Spark Cluster on the Azure Platform for Machine Learning Methods

Збірник наукових праць Харківського національного університету Повітряних Сил, 2020
The paper examines how to improve the performance of machine learning models and methods using Apache Spark on Azure HDInsight.
С.В. Мінухін
doaj +1 more source

Optimization Techniques for a Distributed In-Memory Computing Platform by Leveraging SSD

Applied Sciences, 2021
In this paper, we present several optimization strategies that can improve the overall performance of the distributed in-memory computing system, “Apache Spark”.
June Choi, Jaehyun Lee, Jik-Soo Kim, Jaehwan Lee +3 more
doaj +1 more source
