Apache Spark usage and deployment models for scientific computing [PDF]
This talk shares our recent experience in providing a data analytics platform based on Apache Spark for High Energy Physics, the CERN accelerator logging system, and infrastructure monitoring.
Castro Diogo +4 more
doaj +1 more source
CHANGE DETECTION OF MOBILE LIDAR DATA USING CLOUD COMPUTING [PDF]
Change detection has long been a challenging problem although a lot of research has been conducted in different fields such as remote sensing and photogrammetry, computer vision, and robotics. In this paper, we blend voxel grid and Apache Spark together
K. Liu, J. Boehm, C. Alis
doaj +1 more source
Performance Evaluation of Query Plan Recommendation with Apache Hadoop and Apache Spark
Access plan recommendation is a query optimization approach that executes new queries using previously created query execution plans (QEPs). In this method, the query optimizer divides the query space into clusters.
Elham Azhir +3 more
doaj +1 more source
Optimization Techniques for a Distributed In-Memory Computing Platform by Leveraging SSD
In this paper, we present several optimization strategies that can improve the overall performance of the distributed in-memory computing system, “Apache Spark”.
June Choi +3 more
doaj +1 more source
A study of Apache Spark cluster performance on the Azure platform for machine learning methods
The paper examines how to improve the performance of machine learning models and methods when they are run with Apache Spark on Azure HDInsight.
С.В. Мінухін
doaj +1 more source
Laurelin: Java-native ROOT I/O for Apache Spark [PDF]
Apache Spark[1] is one of the predominant frameworks in the big data space, providing a fully-functional query processing engine, vendor support for hardware accelerators, and performant integrations with scientific computing libraries. One difficulty in
Melo Andrew, Shadura Oksana
doaj +1 more source
A distributed computing model for big data anonymization in the networks
Recently, big data and its applications have grown sharply in various fields such as IoT, bioinformatics, e-commerce, and social media. The huge volume of data has posed enormous challenges to the architecture, infrastructure, and computing capacity of IT ...
Farough Ashkouti, Keyhan Khamforoosh
doaj +1 more source
Performance Analysis of the Distributed Support Vector Machine Algorithm Using Spark for Predicting Flight Delays [PDF]
Big data analysis requires powerful machine learning frameworks, strategies, and environments to analyze data at scale. Therefore, Apache Spark is used as a cluster computing framework to process big data in parallel and can run on multiple clusters ...
Khotimah Husnul +4 more
doaj +1 more source
Multi-Objective Big Data Optimization with jMetal and Spark [PDF]
Big Data Optimization is the term used to refer to optimization problems which have to manage very large amounts of data. In this paper, we focus on the parallelization of metaheuristics with the Apache Spark cluster computing system for solving multi ...
A Cabanas-Abascal +11 more
core +1 more source
Deploying Apache Spark virtual clusters in cloud environments using orchestration technologies
Apache Spark is a framework providing fast computations on Big Data using the MapReduce model. With cloud environments, Big Data processing becomes more flexible, since they allow virtual clusters to be created on demand. One of the most powerful open-source cloud
O. Borisenko +2 more
doaj +1 more source

