Flink - Open Access .click


SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink

Applied Sciences (Switzerland), 2021
Existing SPARQL query engines and triple stores are continuously improved to handle more massive datasets. Several approaches have been developed in this context proposing the storage and querying of RDF data in a distributed fashion, mainly using the ...
Oscar Ceballos +2 more

Continuous outlier mining of streaming data in flink

Information Systems, 2020
In this work, we focus on distance-based outliers in a metric space, where the status of an entity as to whether it is an outlier is based on the number of other entities in its neighborhood. In recent years, several solutions have tackled the problem of distance-based outliers in data streams, where outliers must be mined continuously as new elements ...
Theodoros Toliopoulos +2 more

Apache Flink and clustering-based framework for fast anonymization of IoT stream data

Intelligent Systems With Applications, 2023
In this paper, we present a novel framework that considers the expiration time of an Internet of Things (IoT) data stream in order to anonymize it. IoT is among the fastest-growing technologies in the world. Also, anonymity is one of the safeguards
Alireza Sadeghi-Nasab, Hossein Ghaffarian, Mohsen Rahmani +2 more

FlinkCheck: Property-Based Testing for Apache Flink

IEEE Access, 2019
Apache Flink is an open-source, soft real-time stream processing framework underlying many modern systems dealing with cloud and real-time computing, data analytics, and the Internet of Things, among others. As the complexity of stream-processing systems
Enrique Martín-Martín +2 more

DPASF: a flink library for streaming data preprocessing

Big Data Analytics, 2019
Background: Data preprocessing techniques are devoted to correcting or alleviating errors in data. Discretization and feature selection are two of the most extended data preprocessing techniques.
Alejandro Alcalde-Barros +3 more

Integration of Multi-Source Landslide Disaster Data Based on Flink Framework and APSO Load Balancing Task Scheduling

ISPRS International Journal of Geo-Information
As monitoring technologies and data collection methodologies advance, landslide disaster data reflects attributes such as diverse sources, heterogeneity, substantial volumes, and stringent real-time requirements.
Haibo Yang, Yingchun Cai

Node Priority Scheduling Strategy Based on Heterogeneous Flink Cluster

Jisuanji gongcheng, 2022
The default task scheduling strategy of the Flink stream processing system ignores the cluster heterogeneity and available resources of nodes to a certain extent, resulting in an unbalanced overall cluster load. This study investigates the real-time ...
WANG Wenhao, SHI Xuerong

Explainable Distance-Based Outlier Detection in Data Streams

IEEE Access, 2022
Explaining outliers is a topic that attracts a lot of interest; however, existing proposals focus on the identification of the relevant dimensions. We extend this rationale to unsupervised distance-based outlier detection, and through investigating ...
Theodoros Toliopoulos, Anastasios Gounaris +1 more

Benchmarking Distributed Stream Data Processing Systems

2019
The need for scalable and efficient stream analysis has led to the development of many open-source streaming data processing systems (SDPSs) with highly diverging capabilities and performance characteristics.
Heiskanen, Henri +5 more

s2p: Provenance Research for Stream Processing System

Applied Sciences, 2021
The main purpose of our provenance research for DSP (distributed stream processing) systems is to analyze abnormal results. Provenance for these systems is nontrivial because of the ephemerality of stream data and the instant data processing mode in ...
Qian Ye, Minyan Lu
