
Top Hive Interview Questions – Most Asked

Here are the top 20 objective-type sample Hive interview questions; the answers are given just below each of them. These sample questions were framed by experts from Intellipaat who provide Hive training, to give you an idea of the type of questions that may be asked in an interview. We have taken full care to give correct answers to all the questions. Do comment your thoughts. Happy job hunting!

Top Answers to Hive Interview Questions

1. What is the definition of Hive? What is the present version of Hive? Explain ACID transactions in Hive.
Hive is an open source data warehouse system. We can use Hive for analyzing and querying large datasets stored in Hadoop files. Its query language is similar to SQL. The present version of Hive is 0.13.1.
Hive supports ACID transactions. The full form of ACID is Atomicity, Consistency, Isolation, and Durability. ACID transactions are provided at the row level, so Hive supports the following operations (sketched after this list):
• Insert
• Delete
• Update
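For illustration, here is a minimal sketch of these row-level operations. The table name and columns are hypothetical, and the exact settings and supported statements depend on the Hive release: ACID tables must be bucketed, stored as ORC, and marked transactional.
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-- Hypothetical transactional table: bucketed, ORC, marked transactional
CREATE TABLE employee_txn (id INT, name STRING, salary INT)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
INSERT INTO TABLE employee_txn VALUES (1, 'Sam', 40000);  -- row-level insert
UPDATE employee_txn SET salary = 45000 WHERE id = 1;      -- row-level update
DELETE FROM employee_txn WHERE id = 1;                    -- row-level delete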
2. What kind of data warehouse application is suitable for Hive? What are the types of tables in Hive?
Hive is not considered a full database. The design rules and limitations of Hadoop and HDFS put restrictions on what Hive can do.
Hive is most suitable for data warehouse applications, where:
• Relatively static data is analyzed
• Fast response times are not required
• The data is not changing rapidly
Hive doesn't provide the fundamental features required for OLTP (Online Transaction Processing); it is suitable for data warehouse applications on large datasets. There are two types of tables in Hive (see the sketch after this list):
1. Managed table
2. External table
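As a quick sketch (the table names and the path are hypothetical), the practical difference is who owns the data when the table is dropped:
-- Managed table: Hive owns the data; DROP TABLE deletes both metadata and data
CREATE TABLE managed_emp (id INT, name STRING);
-- External table: Hive only tracks metadata; DROP TABLE leaves the HDFS data intact
CREATE EXTERNAL TABLE external_emp (id INT, name STRING)
LOCATION '/user/data/emp';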
3. Can we change settings within a Hive session? If yes, how?
Yes, we can change the settings within a Hive session using the SET command. It helps change Hive job settings for an exact query.
Example: The following command ensures that buckets are populated according to the table definition:
hive> SET hive.enforce.bucketing=true;
We can see the current value of any property by using SET with the property name. SET by itself will list all the properties with the values set by Hive.
hive> SET hive.enforce.bucketing;
This list will not include the Hadoop defaults, so we should use the command below:
SET -v
It will list all the properties including the Hadoop defaults in the system.
4. Is it possible to add 100 nodes when we already have 100 nodes in Hive? How?
Yes, we can add the nodes by following the below steps:
1) Take a new system; create a new username and password.
2) Install SSH and set up SSH connections with the master node.
3) Add the SSH public_rsa id key to the authorized_keys file.
4) Add the new DataNode's hostname, IP address, and other details to /etc/hosts, and add its entry, e.g. slave3.in slave3, to the slaves file.
5) Start the DataNode on the new node.
6) Log in to the new node, e.g. su hadoop or ssh -X hadoop@<new-node>
7) Start HDFS on the newly added slave node by using the following command:
./bin/hadoop-daemon.sh start datanode
8) Check the output of the jps command on the new node.
5. Explain the concatenation function in Hive with an example.
The concatenate function joins the input strings. We can specify any number of strings, separated by commas.
CONCAT('Intellipaat','-','is','-','a','-','eLearning','-','provider');
So, every time we have to supply the separator '-' between the strings ourselves. If the separator is common for every string, Hive provides another command, CONCAT_WS; in this case, we have to specify the separator first.
CONCAT_WS('-','Intellipaat','is','a','eLearning','provider');
Output: Intellipaat-is-a-eLearning-provider

6. Explain the Trim and Reverse functions in Hive with examples.
The Trim function deletes the spaces associated with a string:
• LTRIM removes the leading spaces
• RTRIM removes the trailing spaces
The Reverse function reverses the characters in the string.
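A short sketch of these functions at the Hive prompt (the input strings are arbitrary; recent Hive versions allow SELECT without a FROM clause):
SELECT TRIM('  Intellipaat  ');   -- 'Intellipaat' (removes leading and trailing spaces)
SELECT LTRIM('  Intellipaat');    -- 'Intellipaat' (removes the leading spaces)
SELECT RTRIM('Intellipaat  ');    -- 'Intellipaat' (removes the trailing spaces)
SELECT REVERSE('Intellipaat');    -- 'taapilletnI' (characters reversed)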
7. How do we change a column data type in Hive? Explain RLIKE in Hive.
We can change the column data type by using ALTER and CHANGE.
The syntax is:
ALTER TABLE table_name CHANGE column_name column_name new_datatype;
Example: If we want to change the data type of the salary column from integer to bigint in the employee table:
ALTER TABLE employee CHANGE salary salary BIGINT;
RLIKE: It stands for "regexp like" and is a special function in Hive. It checks a string against a Java regular expression; i.e., if any substring of A matches B, it evaluates to true.
'Intellipaat' RLIKE 'tell' → true
'Intellipaat' RLIKE '^I.*' → true (this is a regular expression)

8. Explain the process to access subdirectories recursively in Hive queries.
By using the below commands, we can access subdirectories recursively in Hive:
hive> SET mapred.input.dir.recursive=true;
hive> SET hive.mapred.supports.subdirectories=true;
Hive tables can be pointed at the higher-level directory, which is suitable for a directory structure like /data/country/state/city/.
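A hedged sketch (the table, columns, and path are hypothetical): with the two settings above, a table pointed at the top-level directory will also read files in nested subdirectories.
-- Hypothetical external table over the parent directory
CREATE EXTERNAL TABLE city_data (city STRING, population INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/country/';
-- With the recursive settings enabled, this also scans /data/country/state/city/...
SELECT * FROM city_data;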
9. How do we skip header rows from a table in Hive?
Log files often begin with header records that we do not want to include in our Hive queries. To skip these header lines from our tables in Hive, we set a table property that tells Hive how many lines to skip. For example, to skip three header lines (the table name and ROW FORMAT clause here assume comma-delimited employee data):
CREATE EXTERNAL TABLE employee (
name STRING,
id INT,
salary INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/data'
TBLPROPERTIES ("skip.header.line.count"="3");
10. What is the maximum size of the string data type supported by Hive? Mention the binary formats Hive supports.
The maximum size of the string data type supported by Hive is 2 GB.
Hive supports the text file format by default, and it supports these binary formats: sequence files, ORC files, Avro data files, and Parquet files.
Sequence files: the general-purpose binary format; splittable, compressible, and row oriented.
ORC files: the full form of ORC is Optimized Row Columnar. It is a record-columnar, column-oriented storage format. It divides the table into row splits; within each split, the values of each column are stored together, column after column.
Avro data files: like sequence files, they are splittable, compressible, and row oriented, except that they also support schema evolution and multilingual binding.
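As a sketch (the table names and columns are hypothetical; the AVRO and PARQUET keywords require newer Hive releases), each format is selected with a STORED AS clause:
CREATE TABLE emp_seq     (id INT, name STRING) STORED AS SEQUENCEFILE;
CREATE TABLE emp_orc     (id INT, name STRING) STORED AS ORC;
CREATE TABLE emp_avro    (id INT, name STRING) STORED AS AVRO;     -- Hive 0.14+
CREATE TABLE emp_parquet (id INT, name STRING) STORED AS PARQUET;  -- Hive 0.13+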
11. What is the precedence order of Hive configuration?
We use the following precedence hierarchy for setting properties, with each level overriding the ones below it:
1. The SET command in Hive
2. The command-line --hiveconf option
3. hive-site.xml
4. hive-default.xml
5. hadoop-site.xml
6. hadoop-default.xml
12. If you run a SELECT * query in Hive, why does it not run MapReduce?
It's an optimization technique. The hive.fetch.task.conversion property can minimize the latency of the MapReduce overhead: for simple queries (SELECT, FILTER, LIMIT), it skips MapReduce and uses a FETCH task instead, so Hive can execute the query without running a MapReduce job.
By default its value is minimal, which optimizes only SELECT *, filters on partition columns, and LIMIT queries; the other value is more, which also optimizes SELECT with expressions, FILTER, and LIMIT.
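A minimal sketch, assuming a hypothetical employee table:
SET hive.fetch.task.conversion=more;
-- Served directly by a FETCH task; no MapReduce job is launched
SELECT * FROM employee LIMIT 10;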
13. How can Hive improve performance with ORC format tables?
We can store Hive data in a highly efficient manner using the Optimized Row Columnar (ORC) file format, which overcomes many of the limitations of the other Hive file formats. Using ORC files improves performance when Hive is reading, writing, and processing data.
SET hive.compute.query.using.stats=true;
SET hive.stats.dbclass=fs;
CREATE TABLE orc_table (
name STRING)
STORED AS ORC;

14. Explain the functionality of ObjectInspector.
It helps to analyze the internal structure of a row object and the individual structure of columns in Hive. It also provides a uniform way to access complex objects that can be stored in multiple formats in memory, such as:
• An instance of a Java class
• A standard Java object
• A lazily initialized object
The ObjectInspector tells us the structure of the object and also the ways to access its internal fields.
15. Why is a new metastore_db created whenever we run a Hive query?
A local metastore is created when we run Hive in embedded mode, and before creating it Hive checks whether a metastore already exists. This metastore location is defined by the property "javax.jdo.option.ConnectionURL" in the configuration file hive-site.xml, with the default value "jdbc:derby:;databaseName=metastore_db;create=true". Because the databaseName is a relative path, a new metastore_db is created in whichever directory Hive is started from; to change this behavior, set the location to an absolute path, so that the metastore at that location will always be used.
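A hedged sketch of the corresponding hive-site.xml entry (the absolute path is hypothetical):
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <!-- an absolute databaseName makes Hive reuse the same metastore from any directory -->
  <value>jdbc:derby:;databaseName=/home/hduser/metastore_db;create=true</value>
</property>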
16. Explain SerDe in Hive. Explain the currently used SerDe classes in Hive.
The full form of SerDe is Serializer/Deserializer. It helps Hive read data from tables and write it back to HDFS in any custom format. Hive does not own the Hadoop file system format in which the data is stored; users are able to write files to HDFS with whatever tools they like and then use CREATE EXTERNAL TABLE or LOAD DATA INPATH to bring the data into Hive. A SerDe is the customizable mechanism Hive uses to parse data stored in HDFS so that Hive can work with it.
Currently used SerDe classes in Hive are:
• MetadataTypedColumnsetSerDe: used to read and write delimited records such as CSV or tab-separated files
• ThriftSerDe: used to read and write Thrift-serialized objects
• DynamicSerDe: also reads and writes Thrift-serialized objects, but it understands Thrift DDL, so the schema of the object can be provided at runtime
17. What mechanisms are available for connecting from applications when we run Hive as a server?
1. Thrift Client: Using Thrift, you can call Hive commands from various programming languages, e.g. C++, PHP, Java, Python, and Ruby.
2. JDBC Driver: Hive supports a Type 4 (pure Java) JDBC driver.
3. ODBC Driver: Hive supports the ODBC protocol.
18. How do we write our own custom SerDe?
In most cases, end users want to read their own data format rather than write to it, so it is usually enough to write a Deserializer instead of a full SerDe.
Example: The RegexDeserializer deserializes the data using the configuration parameter 'regex' and a list of column names.
If our SerDe supports DDL, we probably want to implement a protocol based on DynamicSerDe; however, it's non-trivial to write a "thrift DDL" parser.
19. Mention the date data type in Hive. Name the Hive collection data types.
The TIMESTAMP data type stores dates in java.sql.Timestamp format.
There are three collection data types in Hive (illustrated below):
1. ARRAY
2. MAP
3. STRUCT
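A short sketch using all three (the table and field names are hypothetical):
CREATE TABLE employee_details (
  name       STRING,
  phones     ARRAY<STRING>,                       -- ARRAY: ordered elements of one type
  deductions MAP<STRING, FLOAT>,                  -- MAP: key/value pairs
  address    STRUCT<street:STRING, city:STRING>   -- STRUCT: named fields of mixed types
);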

20. Can we run UNIX shell commands from Hive? Can Hive queries be executed from script files? How? Give an example.
Yes, we can run UNIX shell commands from Hive by putting a ! mark before the command. For example, !pwd at the Hive prompt will list the current directory.
We can execute Hive queries from script files by using the source command.
Example: hive> source /path/to/file/file_with_query.hql

"0 Responses on Top Hive Interview Questions – Most Asked"

Leave a Message

Your email address will not be published.

Training in Cities

Bangalore, Hyderabad, Chennai, Delhi, Kolkata, UK, London, Chicago, San Francisco, Dallas, Washington, New York, Orlando, Boston

100% Secure Payments. All major credit & debit cards accepted Or Pay by Paypal.


Sales Offer

  • To avail this offer, enroll before 22nd October 2016.
  • This offer cannot be combined with any other offer.
  • This offer is valid on selected courses only.
  • Please use coupon codes mentioned below to avail the offer


Sign Up or Login to view the Free Top Hive Interview Questions – Most Asked.