Apache Spark - Open Access .click

Privacy-Preserving Machine Learning on Apache Spark

IEEE Access, 2023
The adoption of third-party machine learning (ML) cloud services is highly dependent on the security guarantees and the performance penalty they incur on workloads for model training and inference.
Claudia V. Brito +4 more
doaj +1 more source

Dynamic Multi-Objective Optimization With jMetal and Spark: a Case Study [PDF]

2016
Technologies for Big Data and Data Science are receiving increasing research interest nowadays. This paper introduces the prototyping architecture of a tool aimed to solve Big Data Optimization problems.
C Coello +9 more
core +1 more source

Alchemist: An Apache Spark ⇔ MPI interface [PDF]

Concurrency and Computation: Practice and Experience, 2018
Summary: The Apache Spark framework for distributed computation is popular in the data analytics community due to its ease of use, but its MapReduce‐style programming model can incur significant overheads when performing computations that do not map directly onto this model. One way to mitigate these costs is to off‐load computations onto MPI codes.
Alex Gittens +8 more
openaire +2 more sources

Combining Terrier with Apache Spark to Create Agile Experimental Information Retrieval Pipelines [PDF]

2018
Experimentation using IR systems has traditionally been a procedural and laborious process. Queries must be run on an index, with any parameters of the retrieval models suitably tuned.
Macdonald, Craig
core +1 more source

StreamApprox: Approximate Computing for Stream Analytics [PDF]

2017
Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire input dataset ...
Bhatotia, Pramod +5 more
core +1 more source

Performance Evaluation of Apache Spark MLlib for Fake News Classification in Indonesian

Jurnal Teknologi Informasi dan Ilmu Komputer, 2022
Machine learning is used to analyze, classify, or predict data. Performing machine learning tasks requires tools with strong performance and a robust environment in order to achieve good accuracy and time efficiency.
Antonius Angga Kurniawan, Metty Mustikasari +1 more
doaj +1 more source

A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures [PDF]

2014
Scientific problems that depend on processing large amounts of data require overcoming challenges in multiple areas: managing large-scale data distribution, co-placement and scheduling of data with compute resources, and storing and transferring large ...
Fox, Geoffrey C. +4 more
core +1 more source

Time series analysis with apache spark and its applications to energy informatics

Energy Informatics, 2018
In the energy economy, forecasts of different time series are rudimentary. In this study, a prediction for the German day-ahead spot market is created with Apache Spark and R.
Cornelia Krome, Volker Sander
doaj +1 more source

CLASSIFICATION OF BIG POINT CLOUD DATA USING CLOUD COMPUTING [PDF]

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2015
Point cloud data plays a significant role in various geospatial applications, as it conveys plentiful information that can be used for different types of analysis.
K. Liu, J. Boehm
doaj +1 more source

SpaRC: scalable sequence clustering using Apache Spark [PDF]

Bioinformatics, 2018
Motivation: Whole genome shotgun based next-generation transcriptomics and metagenomics studies often generate 100–1000 GB sequence data derived from tens of thousands of different genes or microbial species.
Lizhen Shi +4 more
openaire +4 more sources
