Efficient Data Processing with Apache Spark
Apache Spark has revolutionized the landscape of big data processing by harnessing the power of distributed computing to handle massive datasets. However, as Spark applications increase in size and complexity, effective performance tuning becomes essential.
MaRe: Processing Big Data with application containers on Apache Spark.
Capuccini M, Dahlö M, Toor S, Spjuth O.
pmTM-align: scalable pairwise and multiple structure alignment with Apache Spark and OpenMP.
Chen W, Yao C, Guo Y, Wang Y, Xue Z.
SparkRA: Enabling Big Data Scalability for the GATK RNA-seq Pipeline with Apache Spark.
Al-Ars Z, Wang S, Mushtaq H.
Enabling scalable single-cell transcriptomic analysis through distributed computing with Apache Spark.
Adil A +4 more
SparkDWM: a scalable design of a Data Washing Machine using Apache Spark.
Hagan NKA, Talburt JR.
Elevating Smart Manufacturing with a Unified Predictive Maintenance Platform: The Synergy between Data Warehousing, Apache Spark, and Machine Learning.
Su N, Huang S, Su C.
Data Streaming with Apache Spark
In today’s fast-paced world, real-time data streaming is crucial for many businesses and industries. With tools like Apache Spark and Hadoop, organizations can process vast amounts of data as it’s created, gaining insights in seconds rather than hours.
Using Apache Spark on genome assembly for scalable overlap-graph reduction.
Paul AJ +5 more
SparkGA2: Production-quality memory-efficient Apache Spark based genome analysis framework.
Mushtaq H, Ahmed N, Al-Ars Z.

