Comparative Study of Record Linkage Approaches for Big Data

open access: yes; Walailak Journal of Science and Technology, 2021
Record linkage is a challenging task for Big Data. This paper compares record linkage approaches for Big Data along three dimensions: record linkage phases, dataset properties, and parallel processing approach ...
Randa MOHAMED   +3 more
doaj   +3 more sources

Optimization for Large-Scale Dimension Table Connection Technology in Distributed Environment

open access: yes; Jisuanji kexue yu tansuo, 2022
The large-scale dimension table connection technology in the distributed environment is one of the key technologies in online big data analysis, which is widely used in real-time recommendation, real-time analysis and other fields.
ZHAO Hengtai, ZHAO Yuhai, YUAN Ye, JI Hangxu, QIAO Baiyou, WANG Guoren
doaj   +1 more source

GeoFlink: An Efficient and Scalable Spatial Data Stream Management System

open access: yes; IEEE Access, 2022
This era is witnessing exponential growth in spatial data due to the increase in GPS-enabled devices. Spatial data can be extremely useful to commercial businesses, governments and NGOs if processed in a timely manner.
Salman Ahmed Shaikh   +4 more
doaj   +1 more source

VeilGraph: incremental graph stream processing

open access: yes; Journal of Big Data, 2022
Graphs are found in a plethora of domains, including online social networks, the World Wide Web and the study of epidemics, to name a few. With the advent of greater volumes of information and the need for continuously updated results under temporal ...
Miguel E. Coimbra   +3 more
doaj   +1 more source

DDoS attacks and machine‐learning‐based detection methods: A survey and taxonomy

open access: yes; Engineering Reports, Volume 5, Issue 12, December 2023
This review paper discusses Distributed Denial of Service (DDoS) attacks, machine learning‐based methods for detecting them, and the existing challenges. Some of the most commonly used public datasets are also compared, and their strengths and shortcomings are discussed.
Mohammad Najafimehr   +2 more
wiley   +1 more source

Scalable multi‐site photovoltaic power forecasting based on stream computing

open access: yes; IET Renewable Power Generation, Volume 17, Issue 9, Page 2379-2390, 6 July 2023
This work proposes a multi‐site photovoltaic forecasting system that contains a message queue and a stream engine, where a forecasting model is continuously updated using real‐time data. A benchmark serving 60 sites was performed to verify the scalability of the system.
Yuxi Sun   +4 more
wiley   +1 more source

An investigation of distributed computing for combinatorial testing

open access: yes; Software Testing, Verification and Reliability, Volume 33, Issue 4, June 2023
Combinatorial test generation is the process of generating sets of input parameters for a system under test, by considering interactions between t values of multiple parameters; the paper investigates the use of distributed algorithms to generate such test suites.
Edmond La Chance, Sylvain Hallé
wiley   +1 more source

s2p: Provenance Research for Stream Processing System

open access: yes; Applied Sciences, 2021
The main purpose of our provenance research for DSP (distributed stream processing) systems is to analyze abnormal results. Provenance for these systems is nontrivial because of the ephemerality of stream data and instant data processing mode in ...
Qian Ye, Minyan Lu
doaj   +1 more source

Automated issue assignment using topic modelling on Jira issue tracking data

open access: yes; IET Software, Volume 17, Issue 3, Page 333-344, June 2023
In this work, we provide a methodology for automated issue assignment, designed using Jira issue tracking data extracted from the Apache Software Foundation that describe both features and bugs. Our methodology employs topic modelling to extract the semantics of text features, while optimising the LDA algorithm (number of topics) using the assignment ...
Themistoklis Diamantopoulos   +2 more
wiley   +1 more source

Influencing Factors in the Scalability of Distributed Stream Processing Jobs

open access: yes; IEEE Access, 2021
More and more use cases require fast, accurate, and reliable processing of large volumes of data. This requires a distributed stream processing framework that can distribute the load over several machines.
Giselle Van Dongen, Dirk Van Den Poel
doaj   +1 more source
