Introduction to Hadoop and its constituent ecosystem, understanding MapReduce and HDFS
Big Data, Factors constituting Big Data, Hadoop and Hadoop Ecosystem, MapReduce – Concepts of Map, Reduce, Ordering, Concurrency, Shuffle, Reducing, Hadoop Distributed File System (HDFS) Concepts and its Importance, Deep Dive in MapReduce – Execution Framework, Partitioner, Combiner, Data Types, Key-Value pairs, HDFS Deep Dive – Architecture, Data Replication, Name Node, Data Node, Data Flow, Parallel Copying with DistCp, Hadoop Archives
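The map, combine, partition, shuffle and reduce stages listed above can be sketched in plain Python. This is a single-process illustration of the concepts only, not Hadoop's implementation: every function name here is hypothetical, the word-count job is an assumed example, and the CRC32-based partitioner merely stands in for a hash partitioner.

```python
from collections import defaultdict
from zlib import crc32


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs):
    """Combiner: pre-aggregate one mapper's output locally before the shuffle."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partition(key, num_reducers):
    """Partitioner: choose a reducer for a key from a stable hash of the key."""
    return crc32(key.encode("utf-8")) % num_reducers


def shuffle(mapper_outputs, num_reducers):
    """Shuffle: route each pair to its reducer and group the values by key."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for key, value in pairs:
            buckets[partition(key, num_reducers)][key].append(value)
    return buckets


def reduce_phase(key, values):
    """Reduce: sum all the counts collected for one key."""
    return key, sum(values)


def run_job(lines, num_reducers=2):
    """Run map, combine, shuffle and reduce over a list of input lines."""
    mapper_outputs = [combine(map_phase(line)) for line in lines]
    buckets = shuffle(mapper_outputs, num_reducers)
    result = {}
    for bucket in buckets:
        for key in sorted(bucket):  # each reducer sees its keys in sorted order
            word, total = reduce_phase(key, bucket[key])
            result[word] = total
    return result


if __name__ == "__main__":
    print(run_job(["big data big", "data node"]))
```

Because every occurrence of a key is routed to the same reducer, each word is summed exactly once, whatever number of reducers is chosen.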
Installing Hadoop in Pseudo-Distributed Mode, Understanding Important configuration files, their Properties and Daemon Threads, Accessing HDFS from Command Line, MapReduce – Basic Exercises, Understanding Hadoop Ecosystem, Introduction to Sqoop, use cases and Installation, Introduction to Hive, use cases and Installation, Introduction to Pig, use cases and Installation, Introduction to Oozie, use cases and Installation, Introduction to Flume, use cases and Installation, Introduction to YARN, Mini Project – Importing MySQL Data using Sqoop and Querying it using Hive
How to develop a MapReduce Application, Writing unit tests, Best Practices for developing, writing and debugging MapReduce applications
What Is Pig?, Pig’s Features, Pig Use Cases, Interacting with Pig, Basic Data Analysis with Pig, Pig Latin Syntax, Loading Data, Simple Data Types, Field Definitions, Data Output, Viewing the Schema, Filtering and Sorting Data, Commonly-Used Functions, Hands-On Exercise: Using Pig for ETL Processing
What Is Hive?, Hive Schema and Data Storage, Comparing Hive to Traditional Databases, Hive vs. Pig, Hive Use Cases, Interacting with Hive, Relational Data Analysis with Hive, Hive Databases and Tables, Basic HiveQL Syntax, Data Types, Joining Data Sets, Common Built-in Functions, Hands-On Exercise: Running Hive Queries on the Shell, Scripts, and Hue
Why Hadoop testing is important, Unit testing, Integration testing, Performance testing, Diagnostics, Nightly QA test, Benchmark and end to end tests, Functional testing, Release certification testing, Security testing, Scalability Testing, Commissioning and Decommissioning of Data Nodes Testing, Reliability testing, Release testing
Understanding the Requirement, Preparation of the Testing Estimation, Test Cases, Test Data, Test bed creation, Test Execution, Defect Reporting, Defect Retest, Daily Status report delivery, Test completion, ETL testing at every stage (HDFS, Hive, HBase) while loading the input (logs, files, records, etc.) using Sqoop/Flume, including but not limited to data verification and reconciliation, User Authorization and Authentication testing (Groups, Users, Privileges, etc.), Reporting defects to the development team or manager and driving them to closure, Consolidating all the defects and creating defect reports, Validating new features and issues in Core Hadoop
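The data verification and reconciliation step mentioned above can be illustrated with a small Python sketch. It assumes the source and target rows have already been exported into Python sequences; the function names and the fingerprinting scheme are hypothetical and do not come from any Hadoop tool.

```python
import hashlib
from collections import Counter


def row_fingerprint(row):
    """Build a stable fingerprint of one row from its field values."""
    joined = "\x1f".join(str(field) for field in row)  # unit separator between fields
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def reconcile(source_rows, target_rows):
    """Compare source and target by row count and by per-row fingerprint."""
    source = Counter(row_fingerprint(row) for row in source_rows)
    target = Counter(row_fingerprint(row) for row in target_rows)
    return {
        "source_count": sum(source.values()),
        "target_count": sum(target.values()),
        # rows present in the source but lost during the load
        "missing_in_target": sum((source - target).values()),
        # rows in the target with no matching source row, including duplicates
        "unexpected_in_target": sum((target - source).values()),
    }


if __name__ == "__main__":
    source = [(1, "a"), (2, "b"), (3, "c")]
    target = [(1, "a"), (2, "b"), (2, "b")]
    print(reconcile(source, target))
```

Comparing fingerprints as well as counts matters here: in the sample data both sides hold three rows, yet one row is missing and one is duplicated, which a count-only check would not reveal.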
Working with MRUnit, a testing framework for MapReduce programs
Automation testing using Oozie, Data validation using the QuerySurge tool
Test plan for HDFS upgrade, Test automation and results
How to test installation and configuration
Project 1 – Working with MapReduce, Hive, Sqoop
Problem Statement – Import MySQL data using Sqoop, query it using Hive, and run the word count MapReduce job.
Project 2 – Testing Hadoop using MRUnit
Industry – General
Problem Statement – How to test a Hadoop application using MRUnit
Topics – This project involves working with MRUnit to test a Hadoop application without spinning up a cluster. You will learn how to test the map and reduce stages of an application.
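The idea of testing map and reduce logic without a cluster can be shown with a short Python analogue. MRUnit itself is a Java library, so this sketch does not use its API: it only mirrors the approach of feeding a mapper or reducer a known input and asserting on its output. The word-count functions and test names are hypothetical.

```python
import unittest


def word_count_mapper(line):
    """Mapper under test: emit a (word, 1) pair for each word in the line."""
    return [(word, 1) for word in line.split()]


def word_count_reducer(key, values):
    """Reducer under test: sum the counts received for one key."""
    return key, sum(values)


class WordCountTest(unittest.TestCase):
    def test_mapper_emits_one_pair_per_word(self):
        # Known input in, exact expected pairs out, in emission order.
        self.assertEqual(
            word_count_mapper("cat dog cat"),
            [("cat", 1), ("dog", 1), ("cat", 1)],
        )

    def test_reducer_sums_values_for_key(self):
        self.assertEqual(word_count_reducer("cat", [1, 1]), ("cat", 2))


def run_tests():
    """Run the test case in-process and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WordCountTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Keeping the mapper and reducer as plain functions with no cluster dependencies is what makes this kind of isolated test possible.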
Intellipaat is the pioneer of Hadoop training. This comprehensive Hadoop testing training provides the skills needed to detect, analyze and rectify errors in a Hadoop cluster. You will also gain knowledge of Hadoop components such as HDFS, MapReduce, Hive, Sqoop, Pig, HBase, Flume and Oozie, and master various test case scenarios and POC implementation. Upon completion of the training you will be awarded the Intellipaat Hadoop Testing Certification.
Intellipaat offers lifetime access to videos and course materials, 24/7 support, and upgrades of course material to the latest version at no extra fee. For Hadoop and Spark training you get lifetime use of the Intellipaat Proprietary Virtual Machine and free cloud access for 6 months for performing training exercises, which makes it a one-time investment. We are also exclusively partnered with IBM to provide IBM Certified Hadoop Professional training.
This course is designed to help you clear the Intellipaat Hadoop Testing Certification. The entire course content has been designed by industry professionals to help you get the best jobs in top MNCs. As part of this training you will work on real-time projects and assignments that are directly relevant to real-world industry scenarios, helping you fast-track your career.
At the end of this training program there will be quizzes that reflect the type of questions asked in the respective certification exams and help you score better marks in the certification exam.
The certification will be awarded on completion of the project work (after expert review) and on scoring at least 60% marks in the quiz. Intellipaat certification is well recognized in 80+ top MNCs such as Ericsson, Cisco, Cognizant, Sony, Mu Sigma, Saint-Gobain, Standard Chartered, TCS, Genpact and Hexaware.
Suresh is a senior software architect at NextGen Healthcare who previously worked with IBM Corporation. He has worked on Big Data, Data Science, advanced analytics, Internet of Things and Azure, along with AI domains such as Machine Learning and Deep Learning. He has successfully implemented high-impact projects in major corporations around the world.
David is an experienced blockchain professional who has been bringing integrated blockchain (particularly Hyperledger and Ethereum) and big data solutions to the cloud. He has previously worked on Hadoop, AWS Cloud, Big Data and Pentaho projects that have had a major impact on the revenues of marquee brands around the world.