TRANSMUT‐Spark: Transformation mutation for Apache Spark [PDF]
Summary: This paper proposes TRANSMUT‐Spark for automating mutation testing of big data processing code within Spark programs. Apache Spark is an engine for big data analytics/processing that hides the inherent complexity of parallel big data programming. Nonetheless, programmers must cleverly combine Spark built‐in functions within programs and guide the ...
João Batista de Souza Neto +3 more
openaire +4 more sources
Efficient processing of complex XSD using Hive and Spark [PDF]
eXtensible Markup Language (XML) files are widely used in industry because of their flexibility in representing many kinds of data. Many applications, such as financial records, social networks, and mobile networks, use complex XML schemas with ...
Diana Martinez-Mosquera +2 more
doaj +2 more sources
Efficient Group K Nearest-Neighbor Spatial Query Processing in Apache Spark
Designing and implementing new distributed spatial query algorithms is a current challenge for spatial query processing in distributed computing systems.
Panagiotis Moutafis +3 more
doaj +1 more source
Comparative Study of Record Linkage Approaches for Big Data
Record linkage is a challenging task for Big Data. This paper therefore attempts to shed light on record linkage approaches for Big Data by comparing them along three dimensions: record linkage phases, dataset properties, and parallel processing approach ...
Randa MOHAMED +3 more
doaj +3 more sources
Apache Spark usage and deployment models for scientific computing [PDF]
This talk shares our recent experience in providing a data analytics platform based on Apache Spark for High Energy Physics, the CERN accelerator logging system, and infrastructure monitoring.
Castro Diogo +4 more
doaj +1 more source
Machine Learning-Supported Diabetes Prediction with Apache Spark
Diabetes is one of the critical health problems affecting the organs of the human body. For this reason, diabetes is regarded as a global health problem in the 21st century.
Emre Yıldırım, Ali Çalhan
doaj +1 more source
CHANGE DETECTION OF MOBILE LIDAR DATA USING CLOUD COMPUTING [PDF]
Change detection has long been a challenging problem, although much research has been conducted in fields such as remote sensing and photogrammetry, computer vision, and robotics. In this paper, we blend voxel grid and Apache Spark together ...
K. Liu, J. Boehm, C. Alis
doaj +1 more source
Performance Evaluation of Query Plan Recommendation with Apache Hadoop and Apache Spark
Access plan recommendation is a query optimization approach that executes new queries using previously created query execution plans (QEPs). In this method, the query optimizer divides the query space into clusters.
Elham Azhir +3 more
doaj +1 more source
A Study of Apache Spark Cluster Performance on the Azure Platform for Machine Learning Methods
The paper considers and investigates how to improve the performance of machine learning models and methods using Apache Spark on Azure HDInsight.
С.В. Мінухін
doaj +1 more source
Optimization Techniques for a Distributed In-Memory Computing Platform by Leveraging SSD
In this paper, we present several optimization strategies that can improve the overall performance of the distributed in-memory computing system Apache Spark.
June Choi +3 more
doaj +1 more source

