Accelerating Apache Hive with MPI for Data Warehouse Systems
2015 IEEE 35th International Conference on Distributed Computing Systems, 2015
Data warehouse systems, like Apache Hive, have been widely used in the distributed computing field. However, current generation data warehouse systems have not fully embraced High Performance Computing (HPC) technologies even though the trend of converging Big Data and HPC is emerging.
Fan Liang, Xiao-Yi Lu
Performance Analysis of ECG Big Data using Apache Hive and Apache Pig
2019 8th International Conference on Information and Communication Technologies (ICICT), 2019
Big Data has been regarded as a revolution driven by technological advancement over the last few years. The process of examining massive, heterogeneous, and multiplex datasets that change very often is called Big Data Analytics. Decision making by extracting information from complex and multi-structured data is not possible by using ...
Mudassar Ahmad +3 more
DHive: Query Execution Performance Analysis via Dataflow in Apache Hive
Proceedings of the VLDB Endowment, 2023
Nowadays, Apache Hive has been widely used for large-scale data analysis applications in many organizations. Various visual analytical tools are developed to help Hive users quickly analyze the query execution process and identify the performance bottleneck of executed queries. However, existing tools mostly focus on showing the time usage of query sub-
Chaozu Zhang, Qiaomu Shen, Bo Tang
A profiling tool for apache hive run-time query
2017 International Conference on Computing Methodologies and Communication (ICCMC), 2017
Apache Hive is a tool used conventionally for data warehousing and analysis. Although it is widely used, there is very little research on performance comparison and analysis. One reason is that the techniques used to monitor execution cannot be applied to the intermediate MapReduce code generated from a Hive query.
Divya Kamath +4 more
Profiling apache HIVE query from run time logs
2016 International Conference on Big Data and Smart Computing (BigComp), 2016
Apache Hive is a widely used data warehousing and analysis tool. Developers write SQL-like Hive queries, which are converted into MapReduce programs to run on a cluster. Despite its popularity, there is little research on performance comparison and diagnosis.
Givanna Putri Haryono, Ying Zhou
Automated Table Partitioner (ATAP) in Apache Hive
2018 4th International Conference on Computer and Information Sciences (ICCOINS), 2018
Big Data and Predictive Analytics have been a game-changing paradigm in academia and industry for the past decade, inspiring numerous efforts in multiple spaces. One of many such technologies is Hadoop, an open-source framework based on MapReduce for highly distributed and scalable solutions.
Thivviyan Amirthalingam, Helmi Md Rais
Automated Configuration Parameter Classification Model for Hive Query Plan on the Apache Yarn
2019 IEEE International Conference on Big Data, Cloud Computing, Data Science & Engineering (BCD), 2019
This research proposes an automated configuration parameter classification model to set up an optimized Hive query processing environment on the Apache Hadoop Distributed File System. In this model, the Analysis statistic command is issued to measure expected performance for the Hive tables on the Hadoop YARN platform with varying combinations of ...
Jongyeop Kim +3 more
Apache Hive Performance Improvement Techniques for Relational Data
2019 International Artificial Intelligence and Data Processing Symposium (IDAP), 2019
Hadoop is a widely adopted open-source MapReduce implementation for storing and processing extremely large data sets. However, using Hadoop is not easy for end users, especially for those who are not familiar with the MapReduce approach. Even for simple tasks, like getting raw counts or averages, users have to write MapReduce programs. Apache Hive,
Melih Günay
2017 IEEE 10th International Conference on Cloud Computing (CLOUD), 2017
Apache Hive, Apache Pig and Pivotal HWAQ are very popular open-source cluster computing frameworks for large-scale data analytics. These frameworks hide the complexity of task parallelism and fault tolerance by exposing a simple programming API to users.
Xin Chen +4 more
2016
Apache Hive is a data warehouse framework for storing, managing and querying large data sets. The Hive query language, HiveQL, is a SQL-like language. Hive stores data in HDFS by default, and a Hive table may be used to define structure on the data. Hive supports two kinds of tables: managed tables and external tables.

