
Top PIG Interview Questions – Most Asked

Here are the top 20 objective-type sample Pig interview questions, with the answer given just below each one. These sample questions were framed by experts from Intellipaat who teach Apache Pig Training, to give you an idea of the type of questions that may be asked in an interview. We have taken full care to give correct answers to all the questions. Do comment with your thoughts. Happy job hunting!

Top Answers to PIG Interview Questions

1. Does Pig differ from MapReduce? If yes, how?
Yes. In MapReduce, the developer has to implement operations such as group by, which is performed on the reducer side, and filter and projection, which can be implemented in the map phase. Pig Latin provides these operations ready-made, with operators such as group, order by, and filter. Because a Pig script describes the whole data flow, Pig can analyze it and check for errors early. Pig Latin code is also cheaper to write and maintain than MapReduce Java code.
2. What are the uses of Pig?
Pig is used in three main categories:
1. ETL data pipelines: This is the most common use case for Pig. It helps to populate a data warehouse. Pig can also pipe data to an external application, wait until the application has finished, receive the processed data, and continue from there.
2. Research on raw data.
3. Iterative processing.
3. Name the scalar data types and complex data types in Pig.
The scalar data types in Pig are int, long, float, double, chararray, and bytearray.
The complex data types in Pig are map, tuple, and bag.
Map: A set of key-value pairs. Each key is a chararray, and each value can be any Pig data type, including a complex type.
Example: ['city'#'bang','pin'#560001]
Here, city and pin are keys that map to the values bang and 560001.
Tuple: An ordered collection of fields with a fixed length. Each field can be of any data type.
Bag: An unordered collection of tuples. The tuples in a bag are separated by commas.
Example: {('Bangalore', 560001),('Mysore',570001),('Mumbai',400001)}
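As a rough illustration, the sketch below declares all three complex types in a single LOAD schema and then reads values out of them. The file name cities.txt, the field names, and the map key 'area' are hypothetical, and the file is assumed to hold tab-delimited records written in Pig's text notation.

```pig
-- Hypothetical input file and field names, used only for illustration.
A = LOAD 'cities.txt' AS (
        state:chararray,                               -- scalar field
        capital:tuple(name:chararray, pin:int),        -- tuple: ordered, fixed number of fields
        cities:bag{t:tuple(name:chararray, pin:int)},  -- bag: unordered collection of tuples
        details:map[]                                  -- map: chararray keys, values of any type
    );

-- A tuple field is read with '.', and a map value is looked up with '#'.
B = FOREACH A GENERATE state, capital.name, details#'area';

DESCRIBE A;   -- prints the declared schema
```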
4. Explain the LOAD keyword in Pig script.
LOAD is a relational operator that loads data from the file system.
The first step in a data-flow language is to specify the input, which is done with the LOAD keyword.
The LOAD syntax is
LOAD 'mydata' [USING function] [AS schema];
Example: A = LOAD 'intellipaat.txt';
A = LOAD 'intellipaat.txt' USING PigStorage('\t');
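The optional AS clause is not shown in the examples above, so here is a minimal sketch of it. The comma delimiter and the field names are assumptions made for illustration. Note that Pig expects plain single quotes around file names and delimiters.

```pig
-- Field names and the ',' delimiter are assumed for this example.
A = LOAD 'intellipaat.txt' USING PigStorage(',')
        AS (id:int, name:chararray, city:chararray);

-- With a schema, fields can be referenced by name.
B = FOREACH A GENERATE id, name;

-- Without AS, fields have no names and are referenced by position ($0, $1, ...).
C = LOAD 'intellipaat.txt' USING PigStorage(',');
D = FOREACH C GENERATE $0, $1;
```

When no USING clause is given, Pig falls back to PigStorage with a tab delimiter.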
5. What are the relational operations in Pig? Explain any two with examples.
The relational operations in Pig include:
foreach, order by, filter, group, distinct, join, and limit.
foreach: It takes a set of expressions and applies them to every record in the data pipeline, passing the results to the next operator.
A = LOAD 'input' AS (emp_name:chararray, emp_id:long, emp_add:chararray, phone:chararray, preferences:map[]);
B = FOREACH A GENERATE emp_name, emp_id;
filter: It contains a predicate and lets us select which records are retained in the data pipeline.
Syntax: alias = FILTER alias BY expression;
Here, alias is the name of a relation, BY is a required keyword, and expression is a Boolean expression.
Example: M = FILTER N BY F5 == 4;
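The other operators in the list chain together in the same way. The following is a minimal sketch, assuming the same 'input' file with a shortened schema. The relation names are made up for this example.

```pig
A = LOAD 'input' AS (emp_name:chararray, emp_id:long, emp_add:chararray);

-- group: collect the records that share the same address
by_add = GROUP A BY emp_add;

-- foreach over a grouped relation: count the employees at each address
counts = FOREACH by_add GENERATE group AS emp_add, COUNT(A) AS n;

-- order by and limit: keep the five addresses with the most employees
sorted = ORDER counts BY n DESC;
top5   = LIMIT sorted 5;

-- distinct: remove duplicate names
names = FOREACH A GENERATE emp_name;
uniq  = DISTINCT names;

DUMP top5;
```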
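6. Does Pig support multi-line commands?
Yes, Pig supports both single-line and multi-line commands.
A statement ends at its semicolon, so it can be written on one line or spread over several. A single-line command such as DUMP executes the data flow and displays the result but does not store it in the file system, whereas a multi-line script can end with a STORE statement that writes the data INTO '/output', so the data is stored in HDFS. Multi-line comments are written between /* and */.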
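A short sketch of how this looks in practice. The input file, its schema, and the pass mark of 35 are assumptions made for illustration.

```pig
/* A multi-line comment
   can span several lines. */
A = LOAD 'input'                    -- one statement spread over three lines;
        USING PigStorage(',')       -- it ends at the semicolon
        AS (name:chararray, marks:int);

B = FILTER A BY marks > 35;         -- a single-line statement

DUMP B;                  -- displays the result; nothing is written to the file system
STORE B INTO '/output';  -- writes the result to the file system (HDFS in MapReduce mode)
```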
7. What are the different execution modes available in Pig? Explain.
Three execution modes are available in Pig:
1. Interactive mode or Grunt mode
2. Batch mode or Script mode
3. Embedded mode
Interactive mode or Grunt mode: Pig's interactive shell is known as the Grunt shell. It starts when Pig is invoked without a script file to run. A script can also be run from inside the shell:
grunt> run scriptfile.pig
grunt> exec scriptfile.pig
Batch mode or Script mode: Pig executes the commands specified in a script file.
Embedded mode: We can embed Pig programs in Java and run them from Java.
8. What are the exception handling operators in a Pig script?
The following diagnostic operators are used to debug and troubleshoot a Pig script:
DUMP: Displays the results on the screen.
DESCRIBE: Displays the schema of a particular relation.
ILLUSTRATE: Displays the step-by-step execution of a sequence of Pig statements.
EXPLAIN: Displays the execution plan for Pig Latin statements.
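A minimal sketch of the four operators applied to one small data flow. The file student.txt with a (name, marks) layout is an assumption made for this example.

```pig
-- Assumed input: comma-separated lines of name and marks.
A = LOAD 'student.txt' USING PigStorage(',') AS (name:chararray, marks:int);
B = FILTER A BY marks >= 35;

DESCRIBE B;     -- schema of B
EXPLAIN B;      -- logical, physical, and MapReduce plans used to compute B
ILLUSTRATE B;   -- step-by-step run of the statements on a small sample of the data
DUMP B;         -- runs the data flow and prints the tuples of B
```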
9. What is the difference between the physical plan and the logical plan in a Pig script?
Both plans are created when a Pig script is executed.
Physical plan: It describes the physical operators Pig will use to execute the script, without referring to how they will be executed in MapReduce. A grouping step, for example, is divided into three physical operators: Local Rearrange, Global Rearrange, and Package. Loading and storing functions are resolved in the physical plan, which is then converted into a series of MapReduce jobs.
Example: A: Load(/emp:PigStorage(' '))
Logical plan: The logical plan is created for each line in the Pig script. It is produced after basic parsing and semantic checking. With every line, the logical plan for the program is extended and grows larger, because each statement has its own logical plan. Loading and storing functions are not resolved in the logical plan.
Example: X: (Name: LOLoad schema: emp_id#36:bytearray,emp_name#37:bytearray,city#38:bytearray,salary#39:bytearray)Required Fields:null
10. Is a Pig script case sensitive?
Pig Latin is partly case sensitive and partly case insensitive. The names of user-defined functions, fields, and relations are case sensitive; that is, INTELLIPAAT is not the same as intellipaat, and M = load 'test' is not the same as m = load 'test'. Pig Latin keywords are case insensitive; that is, LOAD is the same as load.
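The rules can be seen side by side in a short sketch. The field names are assumptions, and the commented-out lines are the ones expected to fail.

```pig
-- Keywords are case insensitive: LOAD/load and AS/as are interchangeable.
M = LOAD 'test' AS (Name:chararray);
m = load 'test' as (name:chararray);

-- Relation names are case sensitive: M and m are two different relations.
-- Field names are case sensitive: M has a field Name, m has a field name.
B = FOREACH M GENERATE Name;
-- C = FOREACH M GENERATE name;       -- expected to fail: M has no field called name

-- Function names are case sensitive as well.
D = FOREACH M GENERATE UPPER(Name);
-- E = FOREACH M GENERATE upper(Name);  -- expected to fail: the function is UPPER
```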
11. What is the difference between the GROUP and COGROUP operators in Pig?
The GROUP and COGROUP operators are functionally identical, and both can work with one or more relations. By convention, GROUP is used with a single relation and COGROUP with several. GROUP collects all records with the same key. COGROUP is a combination of group and join: it generalizes GROUP so that, instead of collecting the records of one input based on a key, it collects the records of n inputs based on a key. We can COGROUP up to 127 relations at a time.
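The difference is easiest to see in the shape of the output. A minimal sketch, assuming two hypothetical comma-separated files that share an owner column:

```pig
-- Hypothetical inputs, used only for illustration.
owners  = LOAD 'owners.txt'  USING PigStorage(',') AS (owner:chararray, pet:chararray);
friends = LOAD 'friends.txt' USING PigStorage(',') AS (friend:chararray, owner:chararray);

-- GROUP on one relation: each output tuple is (group, bag of matching owners tuples).
grouped = GROUP owners BY owner;

-- COGROUP on two relations: each output tuple is
-- (group, bag of matching owners tuples, bag of matching friends tuples).
cogrouped = COGROUP owners BY owner, friends BY owner;

DESCRIBE grouped;
DESCRIBE cogrouped;
```

A key that appears in only one of the inputs still produces an output tuple; the bag for the other input is simply empty.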
12. What is the function of the UNION and SPLIT operators? Give examples.
The UNION operator merges the contents of two or more relations.
Syntax: grunt> Relation_name3 = UNION Relation_name1, Relation_name2;
Example: grunt> INTELLIPAAT = UNION intellipaat_data1, intellipaat_data2;
Here, intellipaat_data1 and intellipaat_data2 are relations that have already been loaded from intellipaat_data1.txt and intellipaat_data2.txt.
The SPLIT operator divides the contents of a relation into two or more relations.
Syntax: grunt> SPLIT Relation1_name INTO Relation2_name IF (condition1), Relation3_name IF (condition2);
Example: SPLIT student_details into student_details1 if marks<35, student_details2 if (8590);
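For a self-contained illustration, the sketch below loads two files, merges them with UNION, and then divides the result with SPLIT. The (name, marks) schema, the relation names, and the mark thresholds are all assumptions chosen for this example.

```pig
-- Assumed layout: comma-separated lines of name and marks.
data1 = LOAD 'intellipaat_data1.txt' USING PigStorage(',') AS (name:chararray, marks:int);
data2 = LOAD 'intellipaat_data2.txt' USING PigStorage(',') AS (name:chararray, marks:int);

-- UNION works on relations, not on file names.
student_details = UNION data1, data2;

-- SPLIT routes each tuple to every relation whose condition it satisfies.
SPLIT student_details INTO
    failed      IF marks < 35,
    passed      IF (marks >= 35 AND marks < 85),
    distinction IF marks >= 85;

DUMP distinction;
```

A tuple that matches none of the conditions is dropped, and a tuple that matches several conditions appears in several output relations.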
13. How can we see only the top 15 records from student.txt, out of 100 records, in the HDFS directory?
First, load student.txt into a relation, for example student. We can then get the top 15 records by using the LIMIT operator:
Result = LIMIT student 15;
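Put together as a complete script, this might look like the sketch below. The comma delimiter and the (name, marks) schema are assumptions. LIMIT on its own makes no promise about which 15 tuples come back, so the second half sorts first to get the 15 highest marks.

```pig
-- Assumed layout: comma-separated lines of name and marks.
student = LOAD 'student.txt' USING PigStorage(',') AS (name:chararray, marks:int);

-- Any 15 records.
Result = LIMIT student 15;
DUMP Result;

-- The 15 records with the highest marks.
sorted = ORDER student BY marks DESC;
top15  = LIMIT sorted 15;
DUMP top15;
```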
14. What is the use of BloomMapFile?
BloomMapFile is a class that extends MapFile, so its functionality is similar to that of MapFile. It uses dynamic Bloom filters to provide a rapid membership test for keys, and it is used in the HBase table format.
15. How does the Pig platform handle relational systems data?
There are two ways Pig can work with relational datasets.
1. Load relational data directly into the Hadoop framework, where Pig can access it.
2. Using database connectors, Pig can load data directly from a relational database system and we can access it.
16. What are the limitations of Pig?
Limitations of Pig:
1. Because the Pig platform is designed for ETL-type use cases, it is not a good choice for real-time scenarios.
2. Pig is not a good choice for pinpointing a single record in huge data sets.
3. Pig is built on top of MapReduce, which is batch oriented.
17. What are the common features of Pig and Hive?
The features common to both Hive and Pig are:
1. Both internally convert their commands into MapReduce jobs.
2. Both provide a high-level abstraction on top of MapReduce.
3. Neither supports low-latency queries.
4. Neither supports OLAP or OLTP.
18. Differentiate between Pig Latin and the Pig engine.
Pig Latin is a scripting language, like Perl, for exploring huge data sets. A Pig Latin program is made up of a series of transformations and operations that are applied to the input data to produce output.
The Pig engine is the environment that executes Pig Latin programs. It converts Pig Latin operators into a series of MapReduce jobs.
19. Explain the terms in the syntax below.
EXPLAIN [-script pigscript] [-out path] [-brief] [-dot] [-param param_name = param_value] [-param_file file_name] alias;
-script: Used to specify a Pig script.
-out: Used to specify the output path (directory).
-brief: Does not expand nested plans.
-dot: Outputs a format that can be passed to the dot utility for graphical display. It will generate a directed acyclic graph (DAG) of the plans in any supported format (.gif, .jpg ...).
alias: The name of a relation.
-param param_name = param_value: Used to supply a value for a parameter in the script (parameter substitution).
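A few invocations that follow the syntax above, as a sketch rather than a complete reference. The input file, its schema, and the output directory name explain_out are assumptions made for illustration.

```pig
-- Assumed input: comma-separated lines of name and marks.
A = LOAD 'student.txt' USING PigStorage(',') AS (name:chararray, marks:int);
B = FILTER A BY marks >= 35;

-- Print the plans for B to the screen.
EXPLAIN B;

-- The same, without expanding nested plans.
EXPLAIN -brief B;

-- Write the plans in dot format into the (hypothetical) directory explain_out.
EXPLAIN -out explain_out -dot B;
```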
20. What are all stats classes in the org.apache.pig.tools.pigstats package?
The stats classes in this package are:
• PigStats
• JobStats
• OutputStats
• InputStats