Our ETL tools master’s program will let you gain proficiency in top ETL tools like Informatica, SSIS, OBIEE, Talend, DataStage and Pentaho. You will work on real-world projects in data warehousing, data integration, Hadoop connectivity, data modeling, SCDs and data schemas.
Self-Paced Training
This Intellipaat training in ETL tools will give you a powerful head start when it comes to extract, transform and load processes that exclusively cater to the Business Intelligence domain. This all-in-one course covers six of the most powerful ETL tools, and upon completion of the training you will be a certified ETL professional.
Anybody can take up this training course. However, having a basic knowledge of SQL can be helpful.
The ETL process is of absolute importance in any data warehousing and Business Intelligence scenario. Getting the ETL process right has a direct bearing on the data that is loaded into the data warehouse, which in turn affects the quality of the Business Intelligence derived and, ultimately, the business insights arrived at. This Intellipaat training is your one-stop destination for mastering some of the best ETL tools available in the market today. Upon completing this course, you can command the best salaries in the ETL, data warehousing and Business Intelligence domains in top MNCs around the world.
What is data warehousing, understanding the extract, transform and load processes, what is data aggregation, data scrubbing and data cleansing and the importance of Informatica PowerCenter ETL
Configuring the Informatica tool, installing Informatica, operational administration activities and Integration Services
Hands-on Exercise: Step-by-step process for the installation of Informatica PowerCenter
Understanding the difference between active and passive transformations and the highlights of each transformation
Learning about expression transformation and connected passive transformation to calculate value on a single row
Hands-on Exercise: Calculate value on a single row using connected passive transformation
Different types of transformations like sorter, sequence generator and filter, the characteristics of each and where they are used
Hands-on Exercise: Transform data using the filter technique, use a sequence generator and use a sorter
Working with joiner transformation to bring data from heterogeneous data sources
Hands-on Exercise: Use joiner transformation to bring data from heterogeneous data sources
Understanding the ranking and union transformation, the characteristics and deployment
Hands-on Exercise: Perform ranking and union transformation
Learning the rank and dense rank functions and their syntax
Hands-on Exercise: Perform rank and dense rank functions
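For illustration, here is a minimal sketch (not part of the official courseware) of the rank and dense rank semantics using SQLite window functions from Python; the table, column names and data are made up for the example:

```python
# Minimal illustration of RANK() vs DENSE_RANK() using SQLite window
# functions (available in SQLite 3.25+); table and data are made up.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (rep TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("a", 300), ("b", 300), ("c", 200), ("d", 100)])

query = """
SELECT rep, amount,
       RANK()       OVER (ORDER BY amount DESC) AS rnk,   -- leaves gaps after ties
       DENSE_RANK() OVER (ORDER BY amount DESC) AS drnk   -- no gaps
FROM sales
ORDER BY amount DESC
"""
for row in conn.execute(query):
    print(row)
# ('a', 300, 1, 1), ('b', 300, 1, 1), ('c', 200, 3, 2), ('d', 100, 4, 3)
```

Note how the tie at 300 makes RANK() jump to 3 while DENSE_RANK() continues at 2 — the same distinction the rank transformation topics above cover.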
Understanding how router transformation works and its key features
Hands-on Exercise: Perform router transformation
Lookup transformation overview and different types of lookup transformations: connected, unconnected, dynamic and static
Hands-on Exercise: Perform lookup transformations: connected, unconnected, dynamic and static
What is SCD, XML processing, learning how to handle a flat file, listing and defining various transformations, implementing ‘for loop’ in PowerCenter, the concepts of pushdown optimization and partitioning, constraint-based loading and incremental aggregation
Hands-on Exercise: Load data from a flat file, implement ‘for loop’ in PowerCenter, use pushdown optimization and partitioning, do constraint-based data loading and use incremental aggregation technique to aggregate data
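To make the constraint-based loading idea concrete, here is a hedged sketch (table names and data are illustrative, not from the course): parent rows are loaded before child rows so foreign-key constraints are never violated, which is exactly the load-order problem constraint-based loading solves.

```python
# Sketch of constraint-based load order: the primary-key (parent) target
# is loaded first, then the foreign-key (child) target.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(id))""")

staged_customers = [(1, "Acme"), (2, "Globex")]
staged_orders = [(10, 1), (11, 2)]

# Parent first ...
conn.executemany("INSERT INTO customer VALUES (?, ?)", staged_customers)
# ... then child; reversing these two loads would raise an FK error.
conn.executemany("INSERT INTO orders VALUES (?, ?)", staged_orders)
conn.commit()
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone())  # (2,)
```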
Different types of designers: Mapplet and Worklet, target load plan, loading to multiple targets and linking property
Hands-on Exercise: Create a mapplet and a worklet, plan a target load and load multiple targets
Objectives of performance tuning, defining performance tuning and learning the sequence for tuning
Hands-on Exercise: Do performance tuning by following different techniques
Managing repository, Repository Manager: the client tool, functionalities of previous versions and important tasks in Repository Manager
Hands-on Exercise: Manage tasks in Repository Manager
Understanding and adopting best practices for managing repository
Common tasks in workflow manager, creating dependencies and the scope of workflow monitor
Hands-on Exercise: Create workflow with dependencies of nodes
Defining variables and parameters in Informatica, parameter files and their scope, mapping parameters, worklet and session parameters, workflow and service variables and basic development errors
Hands-on Exercise: Define variables and parameters in functions, use the parameter of mapping, use worklet and session parameters and use workflow and service variables
Session and workflow log, using debuggers, error-handling framework in Informatica and failover and high availability in Informatica
Hands-on Exercise: Debug development errors, read workflow logs and use the error-handling framework
Configurations and mechanisms in recovery and checking health of PowerCenter environment
Hands-on Exercise: Configure recovery options and check health of PowerCenter environment
Using the infacmd, pmrep and infasetup commands and processing flat files
Hands-on Exercise: Use commands: infacmd, pmrep and infasetup
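As a rough sketch of driving these utilities from a script: the repository and domain names below are placeholders, and the exact flags can vary by PowerCenter version, so verify everything against your installation's command reference before use.

```python
# Hedged sketch: invoking PowerCenter command-line utilities from Python.
# Names are placeholders; flags are assumptions to check against the
# command reference for your PowerCenter version.
import subprocess

# Connect to a repository with pmrep (assumed flags: -r repository,
# -d domain, -n user, -x password).
subprocess.run([
    "pmrep", "connect",
    "-r", "MyRepository", "-d", "MyDomain",
    "-n", "admin", "-x", "secret",
], check=True)

# Ping the domain with infacmd (assumed subcommand and flag).
subprocess.run(["infacmd.sh", "ping", "-dn", "MyDomain"], check=True)
```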
Fixed-length and delimited flat files, expression transformations: sequence numbers and dynamic targeting using transaction control
Hands-on Exercise: Perform expression transformations: sequence numbers and dynamic targeting using transaction control
Dynamic target with the use of transaction control and indirect loading
Hands-on Exercise: Use of transaction control with dynamic target and indirect loading
Importance of Java transformations to extend PowerCenter capabilities, transforming data in active and passive modes
Hands-on Exercise: Use Java transformations to extend PowerCenter capabilities
Understanding the unconnected stored procedure in Informatica and different scenarios of unconnected stored procedure usage
Hands-on Exercise: Use the unconnected stored procedure in Informatica in different scenarios
Using SQL transformation (active and passive)
Hands-on Exercise: Use SQL transformation (active and passive)
Understanding incremental loading and aggregation and the comparison between them
Hands-on Exercise: Do incremental loading and aggregation
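The core of incremental loading is a watermark: only rows newer than the last recorded load are pulled from the source. Below is a minimal sketch under illustrative assumptions (SQLite tables and ISO date strings stand in for the real source, target and mapping variables):

```python
# Incremental loading sketch: pull only rows newer than the watermark,
# load them, then advance the watermark for the next run.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src (id INTEGER, updated_at TEXT);
CREATE TABLE tgt (id INTEGER, updated_at TEXT);
CREATE TABLE watermark (last_loaded TEXT);
INSERT INTO watermark VALUES ('2024-01-01');
INSERT INTO src VALUES (1, '2023-12-31'), (2, '2024-01-15');
""")

last = conn.execute("SELECT last_loaded FROM watermark").fetchone()[0]
new_rows = conn.execute(
    "SELECT id, updated_at FROM src WHERE updated_at > ?", (last,)
).fetchall()
conn.executemany("INSERT INTO tgt VALUES (?, ?)", new_rows)

# Advance the watermark so already-loaded rows are skipped next time.
conn.execute("UPDATE watermark SET last_loaded = MAX(last_loaded, ?)",
             (max((r[1] for r in new_rows), default=last),))
print(new_rows)  # only the row updated after the watermark: (2, '2024-01-15')
```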
Working with database constraints using PowerCenter and understanding constraint-based loading and target load order
Hands-on Exercise: Perform constraint-based loading in a given order
Various types of XML transformation in Informatica and configuring a lookup as active
Hands-on Exercise: Perform XML transformation and configure a lookup as active
Understanding what data profiling in Informatica is, its significance in validating content and ensuring quality and structure of data as per business requirements
Hands-on Exercise: Perform data profiling in Informatica and validate the content
Understanding workflow as a group of instructions/commands for integration services and learning how to create and delete workflow in Informatica
Hands-on Exercise: Create and delete workflow in Informatica
Understanding the database connection, creating a new database connection in Informatica and understanding various steps involved
Hands-on Exercise: Create a new database connection in Informatica
Working with relational database tables in Informatica, mapping for loading data from flat files to relational database files
Hands-on Exercise: Create mapping for loading data from flat files to relational database files
Understanding how to deploy Informatica PowerCenter for seamless LinkedIn connectivity
Hands-on Exercise: Deploy Informatica PowerCenter for seamless LinkedIn connectivity
Connecting Informatica PowerCenter with various data sources like social media channels such as Facebook, Twitter, etc.
Hands-on Exercise: Connect Informatica PowerCenter with various data sources like social media channels such as Facebook, Twitter, etc.
Pushdown optimization for load-balancing on the server for better performance and various types of partitioning for optimizing performance
Hands-on Exercise: Optimize using pushdown technique for load-balancing on the server for better performance and create various types of partitioning for optimizing performance
Understanding session cache, the importance of cache creation, implementing session cache and calculating cache requirement
Hands-on Exercise: Implement cache creation and work with session cache
For this project, you will be expected to carry out tasks like creating users, building roles, forming groups, associating users with roles and groups, lock handling, and creating sessions, workflows and worklets.
Deploying Informatica ETL for Business Intelligence
For this project, you will access data from multiple sources, manage current and historical data with SCDs, import source and target tables, and more. You will extract the data into a staging area, move it from the operational data store to the enterprise data warehouse, and generate reports and insights.
Deploying the ETL Transactions on Healthcare Data
Systematically load data within a hospital scenario for easy access. You will extract data from multiple sources, cleanse it into the right format and load it into the CRDW. You will also create CRDW load schedules that run on a daily, weekly and monthly basis.
Case Study 1 - Banking Products Augmentation
This case study is about improving the profits of a bank by customizing existing products and adding new products based on customer needs. You will construct a multidimensional model, deploy a star-join schema, create demographic mini-dimensions and work with Informatica aggregator transformations.
Case Study 2 - Employee Data Integration
In this case study, you will load a table with employee data using Informatica. You will carry out tasks like creating multiple shared tables, working with the plug-and-play capability of the framework, and ensuring code and framework reusability.
Introduction to Business Intelligence, understanding the concepts of data modeling and data cleaning, and learning about data analysis, data representation and data transformation.
Introduction to ETL and the various steps involved – extract, transform and load; reading a flat file of users’ email IDs, extracting the user ID from each email ID and loading the data into a database table.
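A minimal sketch of that flat-file flow follows (file and table names are illustrative, and the script writes its own sample input so it is self-contained): extract email IDs from a flat file, transform each into a user ID, and load the result into a database table.

```python
# Tiny extract-transform-load example: flat file -> user IDs -> database.
import csv
import sqlite3

# Create an illustrative flat file: one email ID per row.
with open("emails.csv", "w", newline="") as f:
    f.write("jane.doe@example.com\njohn.smith@example.com\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id TEXT, email TEXT)")

with open("emails.csv", newline="") as f:       # extract
    for (email,) in csv.reader(f):
        user_id = email.split("@", 1)[0]        # transform: derive user ID
        conn.execute("INSERT INTO users VALUES (?, ?)",
                     (user_id, email))          # load
conn.commit()
print(conn.execute("SELECT * FROM users").fetchall())
# [('jane.doe', 'jane.doe@example.com'), ('john.smith', 'john.smith@example.com')]
```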
Introduction to Connection Managers – the logical representation of a connection, the various types of Connection Managers – flat file and database – understanding how to load faster with OLE DB, comparing the performance of OLE DB and ADO.NET, learning about Bulk Insert, working with Excel Connection Managers and identifying the problems.
Learning what data transformation is, converting data from one format to another, understanding the concepts of Character Map, Data Column and Copy Column Transformations, Import and Export Column Transformations, Script and OLE DB Command Transformations, and understanding Aggregate and Sort Transformations and Percentage and Row Sampling.
Understanding Pivot and Unpivot Transformations, Audit and Row Count Transformations, working with Split and Join Transformations, studying Lookup and Cache Transformations, integrating with Azure Analysis Services, the elastic nature of MSBI to integrate with the Azure cloud service, the scale-out deployment option for MSBI, and working with cloud-borne data sources and query analysis. Scaling out the SSIS package, deploying for tighter windows, working with a larger number of data sources, and SQL Server vNext for enhanced SQL Server features and more choice of development languages and data types both on-premise and in the cloud.
Understanding data that slowly changes over time, learning the process of how new data is written over old data and best practices. Detailed explanation of the three types of SCDs – Type 1, Type 2 and Type 3 – and their differences.
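To make the distinction concrete, here is a hedged sketch contrasting SCD Type 1 (overwrite, history lost) with SCD Type 2 (expire the current row and add a new one, history preserved); Type 3 would instead keep a previous-value column. The dimension schema is illustrative.

```python
# SCD Type 1 vs Type 2 on a made-up customer dimension.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE dim_customer (
    customer_id INTEGER, city TEXT,
    valid_from TEXT, valid_to TEXT, is_current INTEGER)""")
conn.execute("INSERT INTO dim_customer VALUES (1, 'Pune', '2020-01-01', NULL, 1)")

# SCD Type 1: overwrite the old value in place -- history is lost.
conn.execute("UPDATE dim_customer SET city = 'Mumbai' WHERE customer_id = 1")

# SCD Type 2: expire the current row, then insert a new current row,
# so the full history of changes is preserved.
conn.execute("""UPDATE dim_customer
                SET valid_to = '2024-06-01', is_current = 0
                WHERE customer_id = 1 AND is_current = 1""")
conn.execute("""INSERT INTO dim_customer
                VALUES (1, 'Delhi', '2024-06-01', NULL, 1)""")

for row in conn.execute("SELECT * FROM dim_customer"):
    print(row)
# (1, 'Mumbai', '2020-01-01', '2024-06-01', 0)  <- expired history row
# (1, 'Delhi', '2024-06-01', None, 1)           <- current row
```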
Understanding how Fuzzy Lookup Transformation differs from Lookup Transformation and the concept of fuzzy matching
Learning about error rows configuration, package logging, defining package configuration, understanding constraints and event handlers.
Project 1: SSIS
Problem Statement: Create a data flow task to extract data from the XLS format, store it in the SQL database, and store the subcategory- and category-wise sales in a table of the database. Once you get the output, split the dataset into two other tables: Table 1 should contain the Sales (< 100,000), Category and Subcategory columns, and Table 2 should contain the Sales (> 100,000), Subcategory and Category columns, with the Sales column sorted in both tables. Finally, divide the whole dataset in a 70:30 ratio and store the results in two different tables in the database (a sketch of this split logic follows the topics list below).
Topics: Data Flow, ODBC Setup and Connection Manager, Flat File Connection, Transformation, Import Export Transformation, Split and Join Transformation, Merge and Union All Transformation
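The project itself is done with SSIS Conditional Split and Percentage Sampling, but the data flow can be previewed in pandas; in this sketch the column names follow the problem statement while the data values are made up.

```python
# Pandas sketch of the project's split logic (illustrative data).
import pandas as pd

df = pd.DataFrame({
    "Category":    ["Bikes", "Bikes",    "Clothing"],
    "Subcategory": ["Road",  "Mountain", "Jerseys"],
    "Sales":       [250000,  80000,      40000],
})

# Conditional split on the sales threshold, each output sorted on Sales.
table1 = df[df["Sales"] < 100000].sort_values("Sales")   # Sales < 100,000
table2 = df[df["Sales"] > 100000].sort_values("Sales")   # Sales > 100,000

# 70:30 percentage split of the whole dataset.
seventy = df.sample(frac=0.7, random_state=1)
thirty  = df.drop(seventy.index)
print(table1, table2, seventy, thirty, sep="\n\n")
```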
Case Study: SSIS
Problem Statement: Create an OLE DB connection and load the data into SQL Server from Excel; create a transformation that splits people into age groups; create constraints and events in the package; create a project-level parameter and a package parameter at the package level; and extract the data in incremental order.
Topics: Data Flow, ODBC Setup and Connection Manager, Transformation, Split and Join Transformation, Term Extraction and Lookup
Introduction to OBIEE, installation of OBIEE, what data models are and why you need them, the scope, reach and benefits of data modeling, data warehousing, a sample OBIEE report, the business requirement intrinsic in data modeling, various case studies, data modeling implications and the impact of data modeling on Business Intelligence.
Introduction to Business Intelligence, the architecture of data flow, OBIEE architecture, stack description of BI technology, BI Server, BI Scheduler, displaying reports with data, the need for reporting in business, the distinction between OLTP and OLAP, the BI platform in the BI technology stack, the product and dimension hierarchy, multidimensional and relational analytical processing, types of reports and multidimensional modeling.
Online Analytical Processing, the OBIEE admin tools, RPD, the important concepts and terminology, the significance of OLAP in the Business Intelligence life cycle, understanding various data schemas like star, snowflake and constellation, designing with the star schema, creation of the physical layer and a simple RPD, enterprise information model, and aggregate and calculated measures.
Introduction to Oracle Business Intelligence Enterprise Edition, an overview of the OBIEE product, the architecture of OBIEE, key features and components, creating a simple report, business model, hierarchy, presentation and mapping.
Understanding what the Oracle Business Intelligence Repository is, installation of OBIEE on a Windows system, directory structure installation, services, analytics and interactive reporting, dashboard creation, multiple report creation, formula editing and altering column properties.
Understanding how to build the Business Model and Mapping Layer in the BI Repository, creating the Presentation Layer, formatting of data, conditional formatting, saving the report, and creating and sharing folders. Topics – data format, conditional format, removing filters, Like, Advanced, saving the report, shared folder and my folder, and creating a new folder
Working with the Enterprise Manager, testing and validating the Repository, cache disabling, dashboard prompts, filtering, editing dashboards with action links, and the waterfall model.
Working with the Repository, creating a test report, adding calculations, deploying OBIEE analyses, coming up with the landing page UI and its features, and repository, session and presentation variables.
Learning about the Oracle BI Presentation Catalog, accessing and managing objects, report archiving and exporting, data grouping and limiting in analyses, data formatting, conditional formatting, master–detail reports, report creation with multiple subject areas, data mashup, Visual Analyzer, performance tiles, BI functionality, the waterfall model, graphs, pivot tables, pie charts and KPI watchlists.
The OBIEE dashboard setup, basics of dashboards and dashboard pages, deploying Dashboard Builder for building dashboards, editing, sharing and saving dashboard analyses, cache creation and clearing, ODBC functions in OBIEE, Logical Table Source, and summary and detail reports.
Securing the Oracle Business Intelligence Suite with Enterprise Manager, creating alerts, managing grouping and maintenance, administration, the various types of security in OBIEE – object-, task- and folder-level security – and report scheduling.
Project: Report Formatting Using OBIEE
Industry: General
Problem Statement: How to find the revenue generated for a business
Topics: This is an Oracle Business Intelligence project that is associated with creating complex dashboards and formatting the report. You will gain hands-on experience in filtering and sorting the report data depending on the business requirements. This project will also help you understand how to convert the data into graphs for easy visualization and analysis. As part of the project, you will gain experience in calculating the subtotal and grand total in a business scenario while finding the revenue generated.
How Talend works, introduction to Talend Open Studio and its usability, and what metadata is
Creating a new job, the concept and creation of a delimited file, using metadata and its significance, what propagation is, the data integration schema, creating jobs using tFilterRow and string filters, and input delimited file creation
Job design and its features, what tMap is, data aggregation, introduction to tReplicate and how it works, the significance and working of tLogRow, and tMap and its properties.
Extracting data from the source, source and target in a database (MySQL), creating a connection, and importing a schema or metadata
Calling and using functions, what routines are, the use of XML files in Talend, the working of format data functions, and what type casting is
Defining context variables, learning parameterization in ETL, writing an example using tRowGenerator, defining and implementing sorting, what an aggregator is, using tFlow for publishing data, and running a job in a loop.
Learning to start the Thrift Server, connecting the ETL tool with Hadoop, defining the ETL method, implementation of Hive, data import into Hive with an example, an example of partitioning in Hive, the reason the customer table is not overwritten, components of ETL, Hive vs. Pig, data loading using a demo customer, the ETL tool, and parallel data execution.
Big Data and the factors constituting Big Data, Hadoop and the Hadoop ecosystem, MapReduce – concepts of map, reduce, ordering, concurrency, shuffle and reducing, Hadoop Distributed File System (HDFS) concepts and their importance, deep dive into MapReduce – execution framework, partitioner, combiner, data types and key–value pairs, and HDFS deep dive – architecture, data replication, NameNode, DataNode, data flow, parallel copying with DistCp, and Hadoop archives
Installing Hadoop in pseudo-distributed mode, understanding the important configuration files, their properties and daemon threads, and accessing HDFS from the command line
MapReduce – basic exercises, understanding the Hadoop ecosystem, introduction to Sqoop, use cases and installation, introduction to Hive, use cases and installation, introduction to Pig, use cases and installation, introduction to Oozie, use cases and installation, introduction to Flume, use cases and installation, and introduction to YARN
Mini Project – Importing MySQL data using Sqoop and querying it using Hive
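As a rough sketch of this mini project's two steps, with hostnames, credentials, table names and paths as placeholders to be replaced on a configured cluster: Sqoop pulls the MySQL table into HDFS, then a Hive external table is pointed at the imported files and queried.

```python
# Hedged sketch: Sqoop import from MySQL into HDFS, then a Hive query.
# All connection details and paths below are placeholders.
import subprocess

subprocess.run([
    "sqoop", "import",
    "--connect", "jdbc:mysql://dbhost/retail",
    "--username", "etl", "--password", "secret",
    "--table", "customers",
    "--target-dir", "/user/etl/customers",
], check=True)

# Point an external Hive table at the imported files and query it.
hiveql = """
CREATE EXTERNAL TABLE IF NOT EXISTS customers
  (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/etl/customers';
SELECT COUNT(*) FROM customers;
"""
subprocess.run(["hive", "-e", hiveql], check=True)
```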
How to develop a MapReduce application, writing unit tests, best practices for developing and writing, debugging MapReduce applications, and joining data sets in MapReduce
A. Introduction to Hive
What Is Hive?, Hive Schema and Data Storage, Comparing Hive to Traditional Databases, Hive vs. Pig, Hive Use Cases, Interacting with Hive
B. Relational Data Analysis with Hive
Hive Databases and Tables, Basic HiveQL Syntax, Data Types, Joining Data Sets, Common Built-in Functions, Hands-On Exercise: Running Hive Queries on the Shell, Scripts, and Hue
C. Hive Data Management
Hive Data Formats, Creating Databases and Hive-Managed Tables, Loading Data into Hive, Altering Databases and Tables, Self-Managed Tables, Simplifying Queries with Views, Storing Query Results, Controlling Access to Data, Hands-On Exercise: Data Management with Hive
D. Hive Optimization
Understanding Query Performance, Partitioning, Bucketing, Indexing Data
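Partitioning is the most common of these optimizations: it lets Hive read only the files for the partitions a query actually touches. Below is an illustrative HiveQL snippet (table, path and partition names are made up for the example), held in a Python string for easy reuse:

```python
# Illustrative HiveQL showing how a partitioned table prunes the data a
# query must scan. Names and paths are made up for the example.
hiveql = """
CREATE TABLE logs (msg STRING)
PARTITIONED BY (dt STRING);

-- Each load targets one partition (one HDFS directory).
LOAD DATA INPATH '/staging/logs/2024-06-01'
INTO TABLE logs PARTITION (dt = '2024-06-01');

-- The predicate on dt lets Hive read only that partition's files.
SELECT COUNT(*) FROM logs WHERE dt = '2024-06-01';
"""
print(hiveql)  # in practice, submit via: hive -e "<query>" or beeline
```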
E. Extending Hive
User-Defined Functions
F. Hands-On Exercises – Working with huge data sets and querying extensively
G. User-Defined Functions, Optimizing Queries, and Tips and Tricks for Performance Tuning
A. Introduction to Pig
What Is Pig?, Pig’s Features, Pig Use Cases, Interacting with Pig
B. Basic Data Analysis with Pig
Pig Latin Syntax, Loading Data, Simple Data Types, Field Definitions, Data Output, Viewing the Schema, Filtering and Sorting Data, Commonly Used Functions, Hands-On Exercise: Using Pig for ETL Processing
C. Processing Complex Data with Pig
Complex/Nested Data Types, Grouping, Iterating Grouped Data, Hands-On Exercise: Analyzing Data with Pig
D. Multi-Data set Operations with Pig
Techniques for Combining Data Sets, Joining Data Sets in Pig, Set Operations, Splitting Data Sets, Hands-On Exercise
E. Extending Pig
Macros and Imports, UDFs, Using Other Languages to Process Data with Pig, Hands-On Exercise: Extending Pig with Streaming and UDFs
F. Pig Jobs
A. Introduction to Impala
What Is Impala?, How Impala Differs from Hive and Pig, How Impala Differs from Relational Databases, Limitations and Future Directions, Using the Impala Shell
B. Choosing the best (Hive, Pig, Impala)
Putting It All Together and Connecting the Dots, Working with Large Data Sets, Steps Involved in Analyzing Large Data
How ETL tools work in the big data industry, connecting to HDFS from the ETL tool and moving data from the local system to HDFS, moving data from a DBMS to HDFS, working with Hive from the ETL tool, creating a MapReduce job in the ETL tool, and an end-to-end ETL PoC showing Hadoop integration with the ETL tool.
Major project on Hadoop development, Cloudera certification tips and guidance, mock interview preparation, and practical development tips and techniques for certification preparation
1. Project – Jobs
Problem Statement – Create a job using metadata. This includes the following actions:
Create XML File, Create Delimited File, Create Excel File, Create Database Connection
2. Hadoop Projects
A. Project – Working with Map Reduce, Hive, Sqoop
Problem Statement – Import MySQL data using Sqoop, query it using Hive, and run the word count MapReduce job (a streaming sketch of the word count job follows below).
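For the word count portion, one common approach is Hadoop Streaming with a Python mapper and reducer: the mapper emits (word, 1) pairs, Hadoop's shuffle sorts them by key, and the reducer sums the counts per word. The file name and HDFS paths below are placeholders.

```python
# Word count via Hadoop Streaming. Save as wc.py and run the same file
# as both mapper ("map" argument) and reducer ("reduce" argument).
import sys
from itertools import groupby

def mapper():
    # Emit one (word, 1) pair per word on each input line.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Hadoop's shuffle sorts mapper output by key, so identical words
    # arrive as consecutive lines; groupby then sums each word's counts.
    pairs = (line.rstrip("\n").split("\t") for line in sys.stdin)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        print(f"{word}\t{sum(int(count) for _, count in group)}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()

# Submit with something like (paths are placeholders):
#   hadoop jar hadoop-streaming.jar \
#     -input /in -output /out \
#     -mapper "python wc.py map" -reducer "python wc.py reduce" -file wc.py
```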
B. Project – Connecting Pentaho with Hadoop Eco-system
Problem Statement – It includes:
Quick overview of ETL and BI, configuring Pentaho to work with the Hadoop distribution, loading data into the Hadoop cluster, transforming data within the Hadoop cluster, and extracting data from the Hadoop cluster
Free Career Counselling
This training course is designed for clearing the following exams:
The entire course content is in line with respective certification programs and helps you clear the requisite certification exams with ease and get the best jobs in top MNCs.
As part of this training, you will be working on real-time projects and assignments that have immense implications in the real-world industry scenarios, thus helping you fast-track your career effortlessly.
At the end of this training program, there will be quizzes that perfectly reflect the type of questions asked in the respective certification exams and help you score better marks.
Intellipaat Course Completion Certificate will be awarded on the completion of the project work (after the expert review) and upon scoring at least 60% marks in the quiz. Intellipaat certification is well recognized in top 80+ MNCs like Ericsson, Cisco, Cognizant, Sony, Mu Sigma, Saint-Gobain, Standard Chartered, TCS, Genpact, Hexaware, etc.
Our alumni work at 3,000+ top companies
Intellipaat is a market leader in ETL tools training. Today, ETL tools are increasingly used in business scenarios to efficiently derive insights from huge amounts of disparate data. The extract-transform-load process is pretty standard when it comes to getting data from diverse databases and cleansing, filtering, transforming and finally deploying the data into the destination database.
This training includes some of the most powerful and efficient ETL tools like Informatica, SSIS, OBIEE, Talend, DataStage and Pentaho. The entire course content of this combo training is created toward helping you clear multiple certification exams, viz., PowerCenter Developer Certification, Oracle Business Intelligence Foundation Essentials Exam, Talend Data Integration Certified Developer Exam, IBM Certified Solution Developer – InfoSphere DataStage, Pentaho Business Analytics Implementation and Cloudera Spark and Hadoop Developer Certification (CCA175) Exam.
This is a completely career-oriented training designed by industry experts. Your training program includes real-time projects and step-by-step assignments to evaluate your progress and specifically designed quizzes for clearing the requisite certification exams.
Intellipaat also offers lifetime access to videos, course materials, 24/7 support and course material upgrades to the latest version at no extra fee. Hence, it is clearly a one-time investment.
At Intellipaat, you can enroll in either the instructor-led online training or self-paced training. Apart from this, Intellipaat also offers corporate training for organizations to upskill their workforce. All trainers at Intellipaat have 12+ years of relevant industry experience, and they have been actively working as consultants in the same domain, which has made them subject matter experts. Go through the sample videos to check the quality of our trainers.
Intellipaat offers 24/7 query resolution, and you can raise a ticket with the dedicated support team at any time. You can avail yourself of email support for all your queries. If your query does not get resolved through email, we can also arrange one-on-one sessions with our trainers.
You would be glad to know that you can contact Intellipaat support even after the completion of the training. We also do not put a limit on the number of tickets you can raise for query resolution and doubt clearance.
Intellipaat is offering you the most updated, relevant, and high-value real-world projects as part of the training program. This way, you can implement the learning that you have acquired in real-world industry setup. All training comes with multiple projects that thoroughly test your skills, learning, and practical knowledge, making you completely industry-ready.
You will work on highly exciting projects in the domains of high technology, ecommerce, marketing, sales, networking, banking, insurance, etc. After completing the projects successfully, your skills will be equal to 6 months of rigorous industry experience.
Intellipaat actively provides placement assistance to all learners who have successfully completed the training. For this, we are exclusively tied up with over 80 top MNCs from around the world. This way, you can be placed in outstanding organizations such as Sony, Ericsson, TCS, Mu Sigma, Standard Chartered, Cognizant and Cisco, among other equally great enterprises. We also help you with job interview and résumé preparation.
You can definitely make the switch from self-paced training to online instructor-led training by simply paying the extra amount. You can join the very next batch, which will be duly notified to you.
Once you complete Intellipaat’s training program, work on real-world projects, quizzes and assignments, and score at least 60 percent marks in the qualifying exam, you will be awarded Intellipaat’s course completion certificate. This certificate is well recognized in Intellipaat-affiliated organizations, including over 80 top MNCs from around the world and some of the Fortune 500 companies.
No. Our job assistance program is aimed at helping you land your dream job. It offers a potential opportunity for you to explore various competitive openings in the corporate world and find a well-paid job matching your profile. The final decision on hiring will always be based on your performance in the interview and the requirements of the recruiter.