There are no prerequisites for taking this Hadoop administration online training. A basic knowledge of Linux can help.
Hadoop is one of the most widely used frameworks for working with Big Data in a distributed environment. Because Hadoop's advanced features and concepts are now extensively implemented, there is a need for advanced Hadoop administration professionals. This Intellipaat advanced Hadoop admin training makes you proficient in Hadoop administration at an advanced level. Upon completion of the training, you can pursue high-paying jobs at leading companies around the world.
Introduction to advanced Hadoop admin concepts, learning about Applications, Nodes and the Resource Manager components, connecting the RM to nodes, introduction to the container manager in advanced Hadoop, monitoring Containers, executing Containers, the node status updater and node manager, the log handler, Token Secret Managers, per-application interacting components, learning about Web Server security, administering clusters, the web application proxy server.
Learning about Apache Hive and Pig, the various Hive services and clients, understanding Managed Tables and External Tables, the functions of Apache Pig, the concepts of partitioning and bucketing.
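As a rough illustration of the partitioning and bucketing concepts in this module, the sketch below models how a row might be placed: the partition comes from a column value, and the bucket from a hash of another column modulo the bucket count. It assumes an integer bucketing column whose hash is the value itself; the real hash function in Hive depends on the column type and the Hive version, and the names `bucket_for` and `layout` are hypothetical.

```python
from collections import defaultdict

JAVA_INT_MAX = 0x7FFFFFFF  # mask that keeps the hash non-negative


def bucket_for(key: int, num_buckets: int) -> int:
    """Pick a bucket for an integer key (assumption: the hash equals the key)."""
    return (key & JAVA_INT_MAX) % num_buckets


def layout(rows, partition_col, bucket_col, num_buckets):
    """Group rows by (partition directory, bucket number)."""
    placement = defaultdict(list)
    for row in rows:
        partition = f"{partition_col}={row[partition_col]}"
        bucket = bucket_for(row[bucket_col], num_buckets)
        placement[(partition, bucket)].append(row)
    return dict(placement)


rows = [
    {"user_id": 1, "country": "US"},
    {"user_id": 2, "country": "US"},
    {"user_id": 5, "country": "IN"},
    {"user_id": 9, "country": "IN"},
]

# Partition by country, then split each partition into 4 buckets by user_id.
placement = layout(rows, "country", "user_id", 4)
for (partition, bucket), members in sorted(placement.items()):
    print(partition, "bucket", bucket, [r["user_id"] for r in members])
```

Partitioning narrows a query to the matching directories, while bucketing spreads the rows inside each partition across a fixed number of files.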
Introduction to Hadoop security with Kerberos authentication, the various security threats in Hadoop and their solutions, securing HDFS on huge clusters, understanding the three-step Kerberos ticketing protocol, Kerberos setup steps, securing a Hadoop cluster, key distribution center installation, setting up the Kerberos client on Hadoop nodes, creating and distributing keytab files for Hadoop services, setting up Hadoop service principals, configuration files of Hadoop, deploying Hoop for HDFS over HTTP, learning how HttpFS works and how the HDFS proxy differs, understanding Cloudera Sentry and its salient features, Apache Knox and the Knox gateway server.
Introduction to Apache ZooKeeper, a distributed coordination service for distributed applications, the various applications of ZooKeeper, the services offered, its data model, understanding znodes and their types, the various features of ZooKeeper such as znode watches, reads and writes, managing a cluster, maintaining consistency, electing a leader in a ZooKeeper ensemble, mutually exclusive distributed locks.
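To make the leader election topic more concrete, here is a minimal in-memory sketch of the commonly described recipe: each candidate creates an ephemeral sequential znode, the lowest sequence number is the leader, and every other candidate watches only its immediate predecessor. The `FakeElectionPath` class is a hypothetical stand-in written for illustration; it is not a ZooKeeper client API and does not talk to a real ensemble.

```python
class FakeElectionPath:
    """In-memory stand-in for an election parent znode (hypothetical)."""

    def __init__(self):
        self._next_seq = 0
        self._candidates = {}  # znode name -> owning process

    def join(self, owner):
        """Simulate creating an ephemeral sequential znode for a candidate."""
        name = f"candidate-{self._next_seq:010d}"
        self._next_seq += 1
        self._candidates[name] = owner
        return name

    def session_expired(self, name):
        """Simulate an ephemeral znode vanishing when its session ends."""
        self._candidates.pop(name, None)

    def leader(self):
        """The owner of the lowest-numbered znode is the leader."""
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]

    def watch_target(self, name):
        """Return the znode just before `name`; only that one is watched."""
        smaller = [n for n in self._candidates if n < name]
        return max(smaller) if smaller else None


path = FakeElectionPath()
a = path.join("node-a")
b = path.join("node-b")
c = path.join("node-c")

print(path.leader())          # node-a holds the lowest sequence number
print(path.watch_target(c))   # node-c watches node-b's znode, not the leader

path.session_expired(a)       # node-a's session ends, its znode disappears
print(path.leader())          # node-b takes over
```

Watching only the predecessor means a single failure notifies one candidate rather than all of them. The same pattern is the usual basis for a mutually exclusive distributed lock.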
The importance of the Oozie workflow scheduler, Oozie installation, understanding the workflow engine, a deep dive into Oozie workflows, the workflow application, submissions, state transitions, processing jobs with Oozie, learning about Oozie security on Hadoop, submitting jobs to Hadoop, the concepts of multi-tenancy and scalability, Oozie job timelines, the various layers of abstraction, its architecture and coordinator, data and time triggers.
Introduction to Apache Flume, the Big Data ecosystem, physically distributed data sources, the changing structure of data, the anatomy of Flume, its core concepts: Events, Clients, Agents, Sources, Channels, Sinks, Interceptors, Channel selectors, Sink processors, data ingest, the Agent pipeline, transactional data exchange, routing and replicating, why channels are needed, use case: log aggregation, adding a Flume agent, handling a server farm, data volume per agent, an example describing a single-node Flume deployment.
HUE introduction, the HUE ecosystem, What is HUE?, a real-world view of HUE, advantages of HUE, how to upload data in File Browser, viewing the content, integrating users, integrating HDFS, fundamentals of the HUE frontend.
Impala overview, goals, the user view of Impala: SQL, Apache HBase, Impala architecture, the Impala state store, the Impala catalog service, query execution phases, comparing Impala to Hive.
At the end of the course, there will be a quiz and project assignments. Once you complete them, you will be awarded the Intellipaat Course Completion certificate. Become in demand with Intellipaat certifications.