in Big Data Hadoop & Spark by (11.5k points)
I am new to Hive. It looks similar to an RDBMS, with tables, joins, and partitions. As I understand it, Hive stores its data in HDFS and provides a SQL abstraction over HDFS. Is Hive a database on top of HDFS, like HBase, or is it a querying tool over HDFS?

But I doubt that Hive is just a query language, since it has tables, joins, and partitions.

1 Answer

by (32.5k points)
No, Apache Hive is not a relational database. It is a data warehouse built on top of Apache Hadoop that provides data summarization, query, and analysis. It differs from a relational database in that it stores the schema in a database (the metastore) and the processed data in HDFS.
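To make that split between schema and data concrete, here is a minimal Python sketch. It is not Hive code; it only imitates the idea that the table definition lives in one place and the raw file lives in another, with the schema applied when the data is read. The names `metastore`, `raw_file`, and `read_table` and the `weblogs` columns are hypothetical, chosen for illustration.

```python
import csv
import io

# Hypothetical stand-in for the Hive metastore: table name -> (column, type) pairs.
metastore = {
    "weblogs": [("ip", str), ("page", str), ("bytes", int)],
}

# Hypothetical stand-in for a file in HDFS: plain delimited text, no schema inside.
raw_file = "10.0.0.1,/home,512\n10.0.0.2,/about,2048\n"


def read_table(table_name, raw_text):
    """Apply the stored schema to the raw text at read time."""
    schema = metastore[table_name]
    rows = []
    for fields in csv.reader(io.StringIO(raw_text)):
        # Pair each raw field with its column name and cast it to the declared type.
        rows.append({name: cast(value) for (name, cast), value in zip(schema, fields)})
    return rows


rows = read_table("weblogs", raw_file)
```

The file itself never changes; only the metadata decides how its fields are named and typed, which is roughly why dropping or redefining a table in this model does not have to rewrite the underlying data.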

For processing, Hive provides a SQL-like interface for querying data stored in the various databases and file systems that integrate with Hadoop. It supports HiveQL (Hive Query Language), and it automatically translates these SQL-like queries into MapReduce jobs that are executed on Hadoop.
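As a rough illustration of what that translation means, consider a query such as `SELECT page, COUNT(*) FROM weblogs GROUP BY page;`. The Python sketch below imitates the map, shuffle, and reduce steps such a query could be broken into. It is a simplified model and not what Hive actually generates; the table name, the sample records, and all function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical weblog records: (ip, page).
records = [
    ("10.0.0.1", "/home"),
    ("10.0.0.2", "/about"),
    ("10.0.0.3", "/home"),
]


def map_phase(record):
    """Emit (page, 1) for each record, like the GROUP BY key plus a count of one."""
    _ip, page = record
    yield page, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts for one key, which corresponds to COUNT(*) per group."""
    return key, sum(values)


def run_query(rows):
    mapped = [pair for row in rows for pair in map_phase(row)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


hits_per_page = run_query(records)
```

On a real cluster the map and reduce steps run in parallel across many machines, and starting those jobs has a noticeable overhead, which is one reason Hive suits batch queries better than quick single-row lookups.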

Hive is read-oriented and therefore does not support transaction processing, which typically involves a high percentage of write operations. It is best suited to batch jobs such as weblog processing and is designed for OLAP workloads.
