Is a Dataframe Just a Table? [PDF]
Querying data is core to databases and data science. However, the two communities have seemingly different concepts and use cases. As a result, both designers and users of the query languages disagree on whether the core abstractions - dataframes (data ...
Wu, Yifan
core +1 more source
Uncovering the Relationship between Human Connectivity Dynamics and Land Use
Call Detail Record (CDR) data are a type of mobile phone data collected by operators each time a user initiates or receives a phone call or sends or receives an SMS. CDR data are a rich geo-referenced source of user behaviour information.
Olivera Novović +4 more
doaj +1 more source
Integration of Cassandra and Spark in Computer Aided Drug Design
The primary purpose of this paper is to provide a feasibility study of Cassandra and Spark in Computer Aided Drug Design (CADD). The Apache Cassandra database is a big data management tool which can be used to store huge amounts of data in different file ...
Nitha V R
semanticscholar +1 more source
This contribution discusses two different approaches to increasing programming language safety: restricting the language to a subset and extending its original safety mechanisms.
Tomáš Brandejský, Vít Fábera
doaj +1 more source
Machine Learning in Apache Spark Environment for Diagnosis of Diabetes
Disease-related data and information collected by physicians, patients, and researchers seem insignificant at first glance. Still, the same unorganized data contain valuable information that is often hidden.
Farshid Bagheri Saravi +3 more
semanticscholar +1 more source
A Comparison of Big Data Frameworks on a Layered Dataflow Model [PDF]
In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models, for which only informal (and ...
Aldinucci, Marco +3 more
core +2 more sources
Computational Strategies for Scalable Genomics Analysis. [PDF]
The revolution in next-generation DNA sequencing technologies is leading to explosive data growth in genomics, posing a significant challenge to the computing infrastructure and software algorithms for genomics analysis.
Shi, Lizhen, Wang, Zhong
core +1 more source
A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures [PDF]
Scientific problems that depend on processing large amounts of data require overcoming challenges in multiple areas: managing large-scale data distribution, co-placement and scheduling of data with compute resources, and storing and transferring large ...
Fox, Geoffrey C. +4 more
core +1 more source
Large Scale Implementations for Twitter Sentiment Classification
Sentiment Analysis on Twitter Data is indeed a challenging problem due to the nature, diversity and volume of the data. People tend to express their feelings freely, which makes Twitter an ideal source for accumulating a vast amount of opinions towards a
Andreas Kanavos +5 more
doaj +1 more source
Human-Centric Program Synthesis [PDF]
Program synthesis techniques offer significant new capabilities in searching for programs that satisfy high-level specifications. While synthesis has been thoroughly explored for input/output pair specifications (programming-by-example), this paper asks:
Crichton, Will
core +2 more sources

