This Hadoop Administration online training has no prerequisites, though a basic knowledge of Linux helps.
Hadoop is a widely used framework for working with Big Data in a distributed environment. As more organizations adopt Hadoop's advanced features and concepts, the need for Advanced Hadoop Administration professionals has grown. This Intellipaat training in Advanced Hadoop Admin makes you proficient in working with Hadoop administration at an advanced level. Upon completing the training, you can pursue high-paying jobs at companies around the world.
Introduction to advanced Hadoop admin concepts, learning the concepts of Applications, Nodes and Resource Manager components, connecting the Resource Manager (RM) to nodes, introduction to the Container Manager in advanced Hadoop, monitoring Containers, executing Containers, the node status updater and Node Manager, the log handler, Token Secret Managers and per-application interacting components, learning web server security, administering clusters and the web application proxy server
Learning Apache Hive and Pig, various Hive services and clients, understanding managed tables and external tables, the functions of Apache Pig and the concepts of partitioning and bucketing
Introduction to Hadoop security with Kerberos authentication, various security threats in Hadoop and their solutions, securing HDFS on huge clusters, understanding the three-step Kerberos ticketing protocol, Kerberos setup steps, securing a Hadoop cluster, key distribution center (KDC) installation, setting up the Kerberos client on Hadoop nodes, creating and distributing keytab files for Hadoop services, setting up Hadoop service principals, configuration files of Hadoop, deploying Hoop for HDFS over HTTP, learning how HttpFS works and how HDFS proxy differs, and understanding Cloudera Sentry and its salient features, Apache Knox and the Knox gateway server
Introduction to Apache ZooKeeper, a coordination service for distributed applications, the various applications of ZooKeeper, the services it offers, its data model, understanding znodes and their types, various features of ZooKeeper like znode watches, reads and writes, cluster management, maintaining consistency, electing a leader in a ZooKeeper ensemble and mutually exclusive distributed locks
The importance of the Oozie workflow scheduler, Oozie installation, understanding the workflow engine, deep dive into Oozie workflows, the workflow application, submissions, state transitions, job processing with Oozie, Oozie security on Hadoop, submitting jobs to Hadoop, the concepts of multi-tenancy and scalability, Oozie job timelines, various layers of abstraction, its architecture and coordinator, data and time triggers
Introduction to Apache Flume, the Big Data ecosystem, physically distributed data sources, the changing structure of data, the anatomy of Flume, its core concepts, events, clients, agents, sources, channels, sinks, interceptors, channel selectors, sink processors, data ingest, the agent pipeline, transactional data exchange, routing and replicating, why channels are needed, use case: log aggregation, adding a Flume agent, handling a server farm, data volume per agent, an example describing a single-node Flume deployment
Introduction to Hue, the Hue ecosystem, what Hue is, a real-world view of Hue, advantages of Hue, how to upload data in the File Browser and view its content, integrating users, integrating HDFS, the fundamentals of the Hue frontend
Impala overview, goals, user view of Impala: SQL, Apache HBase, Impala architecture, the Impala statestore, the Impala catalog service, query execution phases and comparing Impala to Hive
At the end of the course, there will be a quiz and project assignments. Once you complete them, you will be awarded the Intellipaat Course Completion Certificate. Become in demand with Intellipaat certifications.