Free Practice Test
Instructions:
This is a FREE test and can be attempted multiple times.
Duration: 60 minutes
30 multiple-choice questions
Welcome to your Hadoop Quiz
What is the default block size in HDFS?
64 MB
512 bytes
1024 KB
None of the above
Which switch to the 'hadoop fs' command gives detailed help?
-help
-?
-show
None of the above
What does RPC stand for?
Remote process call
Remote procedure call
Remote processing call
None of the above
Which method of the FileSystem object is used for reading a file in HDFS?
access()
select()
load()
open()
None of the above
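For context, reading a file from HDFS through the Java FileSystem API goes through a method that returns a stream over the file's contents. The sketch below is illustrative only; the file path and configuration are assumptions, not part of the quiz.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Connect to the file system configured in core-site.xml
        FileSystem fs = FileSystem.get(conf);
        // Hypothetical file path used only for illustration
        Path path = new Path("/user/example/input.txt");
        // open() returns an FSDataInputStream positioned at the start of the file
        try (FSDataInputStream in = fs.open(path);
             BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}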
How many methods does the Writable interface define?
Four
Three
Two
Six
None of the above
Which of the following illustrates the concept of distributed cache?
The distributed cache is a component that allows developers to deploy jars for Map-Reduce processing.
The distributed cache is special component on datanode that will cache frequently used data for faster client response. It is used during map step.
The distributed cache is special component on namenode that will cache frequently used data for faster client response. It is used during reduce step.
The distributed cache is a component that caches java objects.
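For context, with the newer MapReduce API a developer registers side files from the job driver so they are copied to every task node before the job runs. The sketch below is a minimal illustration; the job name and file path are assumptions, and mapper/reducer setup is omitted.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CacheExampleDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "distributed-cache-example");
        job.setJarByClass(CacheExampleDriver.class);
        // Ship a lookup file (hypothetical path) to every task node; tasks can
        // then read it from their local working directory via the "lookup" link name.
        job.addCacheFile(new URI("/user/example/lookup.txt#lookup"));
        // ... configure mapper/reducer classes and input/output paths, then submit the job
    }
}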
How can the best performance be measured in Hadoop, and why?
The best performance expectation one can have is measured in seconds. This is because Hadoop can only be used for batch processing.
It depends on the design of the map-reduce program, how many machines are in the cluster, and the amount of data being retrieved.
The best performance expectation one can have is measured in minutes. This is because Hadoop can only be used for batch processing.
None of the above
All of the above
What does the term Writable signify in Hadoop and MapReduce?
Writable is a Java interface that needs to be implemented for streaming data to remote servers.
Writable is a Java interface that needs to be implemented for HDFS writes.
Writable is a Java interface that needs to be implemented for MapReduce processing.
All of the above
None of these
What are the advantages and optimization benefits of Writable data types in Hadoop and MapReduce compared to the default Java classes?
Writable data types are specifically optimized for data retrieval
Writable data types are specifically optimized for file system storage
Writable data types are specifically optimized for network transmissions
Writable data types are specifically optimized for map-reduce processing
Can custom data types be implemented and used in MapReduce?
Yes, but only for mappers.
No, Hadoop does not provide techniques for custom datatypes.
Yes, custom data types can be implemented as long as they implement the Writable interface.
Yes, but only for reducers.
All of the above
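As an illustration of the last two questions, a custom data type becomes usable in MapReduce by implementing the Writable interface, i.e. its write() and readFields() methods. The PointWritable class below is a hypothetical example, not part of the Hadoop API.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Hypothetical custom value type holding a pair of doubles.
public class PointWritable implements Writable {
    private double x;
    private double y;

    public PointWritable() { }                          // Writables need a no-arg constructor
    public PointWritable(double x, double y) { this.x = x; this.y = y; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeDouble(x);                             // Serialize fields in a fixed order
        out.writeDouble(y);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        x = in.readDouble();                            // Deserialize in the same order
        y = in.readDouble();
    }
}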
How are Pig and MapReduce related?
Pig provides no additional capabilities to MapReduce. Pig programs are executed as MapReduce jobs via the Pig interpreter.
Pig programs rely on MapReduce but are extensible, allowing developers to do special-purpose processing not provided by MapReduce.
Pig provides the additional capability of allowing you to control the flow of multiple MapReduce jobs.
They are not related
Pig provides additional capabilities that allow certain types of data manipulation not possible with MapReduce.
Which tool is used to generate Java classes that will process data imported into HDFS from a relational database?
Sqoop
Hue
Hive
MapReduce
Pig
Which of these is data warehousing software with an SQL-like query language for running ad-hoc queries on data in HDFS?
Hue
Pig
Oozie
Hive
Sqoop
What does an Oozie workflow contain?
Sequences of MapReduce and Pig jobs. These sequences can be combined with other actions including forks, decision points, and path joins.
Sequences of MapReduce and Pig jobs. These are limited to linear sequences of actions with exception handlers but no forks.
Sequences of MapReduce jobs only; no Pig or Hive tasks or jobs. These MapReduce sequences can be combined with forks and path joins.
Iterative repetition of MapReduce jobs until a desired answer or state is reached.
Which of the following is a distributed, scalable data store for random, real-time read/write data access?
Pig
HBase
Hive
Hue
Oozie
Which tool can create and execute MapReduce jobs with any executable or script as the mapper or reducer?
Hadoop Streaming
Sqoop
Flume
Oozie
Hive
Which programming language is the MapReduce framework written in?
C
Java
Python
FORTRAN
What programming language support does MapReduce provide?
The most common programming language is Java, but scripting languages are also supported via Hadoop streaming.
Only Java is supported, since Hadoop was written in Java.
Any programming language that can comply with the MapReduce concept can be supported.
Currently Map Reduce supports Java, C, C++ and COBOL.
Which of the following describes the significance of sequence files?
Sequence files are intermediate files that are created by Hadoop after the map step.
Sequence files are binary format files that are compressed and are splittable. They are often used in high-performance map-reduce jobs.
Sequence files are a type of file in the Hadoop framework that allows data to be sorted.
All of the above
None of the above
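For reference, sequence files are created and read through the org.apache.hadoop.io.SequenceFile class. The writer sketch below is illustrative; the output path and key/value types are assumptions.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path("/user/example/data.seq");   // hypothetical output path
        SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(path),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(IntWritable.class));
        try {
            // Append binary (key, value) records; the resulting file is splittable
            // across map tasks and can optionally be compressed.
            writer.append(new Text("record-1"), new IntWritable(42));
            writer.append(new Text("record-2"), new IntWritable(7));
        } finally {
            writer.close();
        }
    }
}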
What does the term Pig refer to?
PIG is the third most popular form of meat in the US behind poultry and beef.
Pig is a part of the Apache Hadoop project that provides a C-like scripting language interface for data processing.
Pig is a subset of the Hadoop API for data processing.
Pig is a part of the Apache Hadoop project. It is a 'PL-SQL' interface for data processing in a Hadoop cluster.
None of the above
Can Avro data be processed and used by MapReduce jobs?
Avro specifies metadata that allows easier data access. This data cannot be used as part of map-reduce execution; it serves as an input specification only.
Yes, but extensive additional coding is required
No, Avro was specifically designed for data storage only
Yes, Avro was specifically designed for data processing via Map-Reduce
Which of the following describes the default input format?
The default input format is XML. Developers can specify other input formats as appropriate if XML is not the correct input.
The default input format is TextInputFormat, with the byte offset as the key and the entire line as the value.
There is no default input format. The input format always should be specified.
The default input format is a sequence file format. The data needs to be preprocessed before using the default input format.
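As a point of reference, with the default TextInputFormat each mapper receives the byte offset of a line as the key (LongWritable) and the line itself as the value (Text). The mapper below is a minimal illustrative sketch; its class name and output types are assumptions.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LineMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // With TextInputFormat, 'offset' is the byte position of the line in the
        // input split and 'line' is the full line of text.
        context.write(line, offset);
    }
}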
Which of the following techniques is used to disable the reduce step?
The Hadoop administrator has to set the number of reducer slots to zero on all slave nodes. This will disable the reduce step.
While you cannot completely disable reducers, you can set the number of reducers to one. There needs to be at least one reduce step in the MapReduce abstraction.
It is impossible to disable the reduce step since it is a critical part of the MapReduce abstraction.
A developer can always set the number of reducers to zero. That will completely disable the reduce step.
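For illustration, the number of reducers is set in the job driver; setting it to zero makes the job map-only. The sketch below assumes a hypothetical job name, and mapper and path setup are omitted.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MapOnlyJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "map-only-example");
        job.setJarByClass(MapOnlyJobDriver.class);
        // With zero reduce tasks the shuffle/sort and reduce phases are skipped
        // and map output is written directly to HDFS.
        job.setNumReduceTasks(0);
        // ... set the mapper class and input/output paths, then submit
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}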
Which statement is true?
The output of the reducer could be zero records
The output of the reducer is written to HDFS
In practice, the reducer usually emits a single key/value pair for each input key
All of the above
What is data locality?
Bringing the data to the local node before processing it.
Hadoop will start the map task on the node where the data block is stored in HDFS
Both 1 and 2 are possible
None of the above is correct
What happens if a mapper runs slowly relative to the others?
If a mapper is running slowly, another mapper will be started by Hadoop
The result of whichever mapper finishes first will be used
No reducer can start until the last mapper has finished
Hadoop will kill the slow mapper if it is still running when the new one finishes
All of the above
What is a combiner?
It runs locally on a single mapper's output
Using a combiner can reduce network traffic
Generally, the combiner and reducer code is the same
None of the above
All of the above
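For context, a combiner is configured in the job driver and runs locally on each mapper's output before the shuffle. In the common case of a summing reducer, the same class can be reused as the combiner, as in the sketch below (class and job names are illustrative; mapper and path setup are omitted).

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;

public class CombinerExampleDriver {

    // A simple summing reducer; because summing is associative and commutative,
    // the same class can safely double as the combiner.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "combiner-example");
        job.setJarByClass(CombinerExampleDriver.class);
        // The combiner pre-aggregates each mapper's output locally, cutting the
        // amount of data shuffled across the network to the reducers.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        // ... set the mapper class, output types, and input/output paths, then submit
    }
}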
Which is correct for the pseudo-distributed mode of Hadoop?
It is a single-machine cluster
It is not required to run all the daemons in this mode
Both 1 and 2 are correct
All daemons run on the same machine
All of the above are correct
Which daemon is responsible for the housekeeping of the NameNode?
NameNode itself
TaskTracker
JobTracker
Secondary NameNode
Which daemon is responsible for instantiating and monitoring individual Map and Reduce tasks?
DataNode
Secondary NameNode
JobTracker
TaskTracker