
Exploration of Apache Hadoop Techniques: Mapreduce and Hive for Big Data

Communications in Computer and Information Science, 2018
With the rapid growth of technology, huge amounts of data are being generated by sources such as sensor networks, IoT, online transactions, and social media. Big data refers to collections of voluminous and complex data sets, encompassing large data volumes, social media analytics, real-time data, and data management capabilities.
Dr P K Gupta

Creating an Apache Hive Table with MongoDB

2015
MongoDB is a leading NoSQL database. It stores data in BSON (Binary JSON), a JSON-style document format with dynamic schemas that provide flexibility in storage. The Hive MongoDB Storage Handler makes it possible to access MongoDB from Apache Hive. A Hive external table may be created on a MongoDB document store.

Performance analysis of MySQL partition, hive partition-bucketing and Apache Pig

2016 1st India International Conference on Information Processing (IICIP), 2016
Streaming data analysis has attracted attention in various applications such as financial records and data analysis. Such applications require continuous storage of large amounts of data in a data warehouse while simultaneously providing quick response times for queries against the stored data. The duration of fetching data ...

Large-Scale Data Analytics Tools: Apache Hive, Pig, and HBase

2016
Apache Hadoop is an open-source project that allows for the distributed processing of huge data sets across clusters of computers using simple programming models. It is designed to store, analyze, and access massive amounts of data quickly across clusters of commodity hardware.
N Maheswari

Performance evaluation between Apache Pig and Hive on Covid-19 vaccination progress

AIP Conference Proceedings, 2023
Mohamed Faris Laham   +3 more

GHive: A Demonstration of GPU-Accelerated Query Processing in Apache Hive

Proceedings of the 2022 International Conference on Management of Data, 2022
As a distributed, fault-tolerant data warehouse system for large-scale data analytics, Apache Hive has been used for various applications in many organizations (e.g., Facebook, Amazon, and Huawei). Exploiting the high degree of parallelism of GPUs to improve the performance of online analytical processing (OLAP) in database systems is a common practice ...
Haotian Liu   +18 more

Major technical advancements in Apache Hive

Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, 2014
Apache Hive is a widely used data warehouse system for Apache Hadoop, and has been adopted by many organizations for various big data analytics applications. Closely working with many users and organizations, we have identified several shortcomings of Hive in its file formats, query planning, and query execution, which are key factors determining the ...
Yin Huai   +9 more

Modeling Apache Hive based applications in Big Data architectures

Proceedings of the 7th International Conference on Performance Evaluation Methodologies and Tools, 2014
Performance prediction for Big Data applications is a powerful tool that helps designers and administrators make better use of their computing resources. Big Data architectures are complex, continuously evolving, and adaptive, so a rapid modeling approach for design and verification can fit these needs.
Enrico Barbierato   +2 more

Analyzing Performance of Apache Pig and Apache Hive with Hadoop

2018
Big Data is the term used for huge datasets that are very complex in nature and difficult to process using traditional devices. The current requirement is for a new technology for analyzing these huge datasets. One of the best options is Apache Hadoop as it consists of various components which work simultaneously to provide an efficient and ...
Krati Bansal   +2 more
