Pig and Hive serve the same purpose: both are tools that spare you the complexity of writing Java MapReduce programs by hand. Hive is a data warehouse that uses MapReduce to analyze data stored on HDFS. It provides a query language called HiveQL that closely resembles standard Structured Query Language (SQL). Hive was developed at Facebook to make it possible for analysts with strong SQL skills but little Java programming experience to run queries on the large volumes of data that Facebook stored in HDFS. Apache Pig and Hive are two projects that layer on top of Hadoop and provide a higher-level language for using Hadoop's MapReduce library.
This blog will help you get a better understanding of what Hive is.
Rather than offering a way to develop map and reduce tasks more quickly, Hive provides a query language based on standard SQL. Hive takes HiveQL statements and automatically transforms them into one or more MapReduce jobs. It then runs the overall MapReduce program and returns the output to the user. Whereas Hadoop streaming shortens the required code, compile, submit cycle, Hive removes it entirely and requires only the composition of HiveQL statements.
This interface to Hadoop not only shortens the time required to produce results from data analysis, it also significantly expands who can use Hadoop and MapReduce.
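To make the translation from a HiveQL statement to a MapReduce job more concrete, here is a minimal Python sketch of the map, shuffle, and reduce steps that a simple aggregate query corresponds to conceptually. It is not Hive's actual execution plan or code. The `employee` rows, the `dept` column, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for an `employee` table: (name, dept).
EMPLOYEES = [
    ("asha", "sales"),
    ("bilal", "engineering"),
    ("chen", "sales"),
    ("dara", "engineering"),
    ("emeka", "engineering"),
]

# The HiveQL statement being illustrated (assumed schema):
#   SELECT dept, COUNT(*) FROM employee GROUP BY dept;


def map_phase(rows):
    """Emit one (key, value) pair per input row: (dept, 1)."""
    for _name, dept in rows:
        yield dept, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key to get the per-department row count."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_dept(rows):
    """Chain the three steps, mimicking a single MapReduce job."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_dept(EMPLOYEES))
```

With Hive, the analyst writes only the one-line query in the comment. The equivalent of the three functions above is generated and submitted to the cluster automatically.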
What makes Hive popular on Hadoop?
- It provides users with powerful statistical functions.
- It is similar to SQL, so it is easy to learn.
- It can be integrated with HBase to query the data stored in HBase. Pig does not offer this kind of integration; in Pig, a load function named HBaseStorage() is used to load data from HBase.
- It is supported by Hue.
- It has a broad user base that includes CNET, Last.fm, Facebook, and Digg.
Difference between Hive and Pig
- Hive is used for data analysis whereas Pig is used for research and programming.
- Hive is used for structured data whereas Pig is used for semi-structured data.
- Hive has HiveQL whereas Pig has Pig Latin.
- Hive is used for creating reports whereas Pig is used for programming.
- Hive works on the server side whereas Pig works on the client side.
- Hive originally did not support Avro (later releases added it through the AvroSerDe) whereas Pig supports Avro.
hive> select * from employee;
hive> describe employee;
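For readers without a Hive installation at hand, the following sketch shows the kind of result these two statements return. It uses Python's built-in `sqlite3` module purely as a stand-in; it does not connect to Hive. The `employee` columns and rows are hypothetical, and SQLite's `PRAGMA table_info` is used as a rough analogue of Hive's `describe`, which SQLite itself does not support.

```python
import sqlite3


def demo():
    """Run SQLite analogues of `select * from employee` and `describe employee`."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; the original post does not define the table.
    conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, dept TEXT)")
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "asha", "sales"), (2, "bilal", "engineering")],
    )

    # Same statement as in the Hive shell: return every row and column.
    rows = conn.execute("SELECT * FROM employee").fetchall()

    # Stand-in for `describe employee`: list each column name and its type.
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = [(r[1], r[2]) for r in conn.execute("PRAGMA table_info(employee)")]

    conn.close()
    return rows, columns


if __name__ == "__main__":
    rows, columns = demo()
    print(rows)
    print(columns)
```

In Hive, the `select` statement works the same way from the user's point of view, while `describe` lists the table's column names and data types.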