Mining Frequency of Drug Side Effects Over a Large Twitter Dataset Using Apache Spark [PDF]

open access: yes, 2017
Despite clinical trials by pharmaceutical companies as well as current FDA reporting systems, there are still drug side effects that have not been caught. To find a larger sample of reports, a possible way is to mine online social media. With its current
Hsu, Dennis
core   +3 more sources

Time series analysis with apache spark and its applications to energy informatics

open access: yes, Energy Informatics, 2018
In energy economy forecasts of different time series are rudimentary. In this study, a prediction for the German day-ahead spot market is created with Apache Spark and R.
Cornelia Krome, Volker Sander
doaj   +1 more source

SpaRC: scalable sequence clustering using Apache Spark [PDF]

open access: yes, Bioinformatics, 2018
Motivation: Whole genome shotgun based next-generation transcriptomics and metagenomics studies often generate 100–1000 GB sequence data derived from tens of thousands of different genes or microbial species.
Lizhen Shi   +4 more
openaire   +4 more sources

CLASSIFICATION OF BIG POINT CLOUD DATA USING CLOUD COMPUTING [PDF]

open access: yes, The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2015
Point cloud data plays a significant role in various geospatial applications as it conveys plentiful information which can be used for different types of analysis.
K. Liu, J. Boehm
doaj   +1 more source

Scientific Computing Meets Big Data Technology: An Astronomy Use Case

open access: yes, 2015
Scientific analyses commonly compose multiple single-process programs into a dataflow. An end-to-end dataflow of single-process programs is known as a many-task application.
Barbary, Kyle   +7 more
core   +2 more sources

Monitoring Data Integrity in Big Data Analytics Services [PDF]

open access: yes, 2018
Enabled by advances in Cloud technologies, Big Data Analytics Services (BDAS) can improve many processes and identify extra information from previously untapped data sources. As our experience with BDAS and its benefits grows and technology for obtaining
Kloukinas, C.   +2 more
core   +1 more source

Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources [PDF]

open access: yes, 2018
Apache Calcite is a foundational software framework that provides query processing, optimization, and query language support to many popular open-source data processing systems such as Apache Hive, Apache Storm, Apache Flink, Druid, and MapD.
Begoli, Edmon   +4 more
core   +2 more sources

Diaspore: Diagnosing Performance Interference in Apache Spark

open access: yes, IEEE Access, 2021
Apache Spark is being increasingly used to execute big data applications on cluster computing platforms. To increase system utilization, cluster operators often configure their clusters such that multiple co-located applications can simultaneously share ...
Sarah Shah   +2 more
doaj   +1 more source

Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist

open access: yes, 2018
Apache Spark is a popular system aimed at the analysis of large data sets, but recent studies have shown that certain computations (in particular, many linear algebra computations that are the basis for solving common machine learning problems) are ...
Gerhardt, Lisa   +8 more
core   +1 more source

A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures [PDF]

open access: yes, 2014
Scientific problems that depend on processing large amounts of data require overcoming challenges in multiple areas: managing large-scale data distribution, co-placement and scheduling of data with compute resources, and storing and transferring large ...
Fox, Geoffrey C.   +4 more
core   +1 more source
