Introduction to the IBM Information Server Architecture, the Server Suite components, the various tiers in the Information Server.
Understanding IBM InfoSphere DataStage, the job life cycle (develop, test, deploy and run data jobs), the high-performance parallel framework, and real-time data integration.
Introduction to the design elements, the various types of DataStage jobs, the massively parallel framework, scalable ETL features, and working with DataStage jobs.
Understanding the DataStage Job, creating a Job that can effectively extract, transform and load data, cleansing and formatting data to improve its quality.
Learning about data parallelism: pipeline parallelism and partitioning parallelism; the two types of data partitioning (key-based and keyless); a detailed look at partitioning techniques such as round robin, entire, hash, range, DB2 and same; and data collecting techniques such as round robin, ordered and sort merge.
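As a rough illustration of the partitioning and collecting ideas listed above, the following Python sketch mimics keyless (round robin) partitioning, key-based (hash) partitioning, and a sort merge collector. It is not DataStage code: the function names, the sample rows, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import heapq
import zlib


def round_robin_partition(rows, num_partitions):
    """Keyless: deal rows out to partitions in turn, like dealing cards."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions


def hash_partition(rows, key, num_partitions):
    """Key-based: rows with the same key value always share a partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # CRC32 is used only because it is deterministic across runs
        # (an assumption for this sketch, not what DataStage uses).
        h = zlib.crc32(str(row[key]).encode("utf-8"))
        partitions[h % num_partitions].append(row)
    return partitions


def sort_merge_collect(partitions, key):
    """Collector: merge partitions that are each already sorted on `key`."""
    return list(heapq.merge(*partitions, key=key))


# Hypothetical sample data.
rows = [
    {"customer": "A", "amount": 10},
    {"customer": "B", "amount": 25},
    {"customer": "A", "amount": 5},
    {"customer": "C", "amount": 40},
    {"customer": "B", "amount": 15},
]

by_turn = round_robin_partition(rows, 2)
by_customer = hash_partition(rows, "customer", 3)
```

Round robin gives evenly sized partitions but scatters related rows, whereas hash partitioning keeps all rows for one key together, which matters for stages that group or join on that key.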
Understanding the various job stages – data source, transformer, final database, the various parallel stages – general objects, debug and development stages, processing stage, file stage types, database stage, real time stage, restructure stage, data quality and sequence stages of InfoSphere DataStage.
Understanding the parallel job stage editors, the important types of stage editors in DataStage.
Working with the Sequential file stages, understanding runtime column propagation, working with RCP in sequential file stages, using the sequential file stage as a source stage and target stage.
Understanding the difference between dataset and fileset and how DataStage works in each scenario.
Creating a sample DataStage job using the dataset and fileset types of data.
Learning about the various properties of Sequential File Stage and Dataset stage.
Creating a lookup file set, working in parallel or sequential stage, learning about single input and output link.
Studying the Transformer Stage in DataStage, the basic working of this stage, its characteristics (a single input, any number of outputs, and a reject link), how it differs from other processing stages, the significance of the Transformer Editor, and the evaluation sequence in this stage.
Deep dive into Transformer functions – String, type conversion, null handling, mathematical, utility functions, understanding the various features like constraint, system variables, conditional job aborting, Operators and Trigger Tab.
Understanding the looping functionality in Transformer Stage, output with multiple rows for single input row, the procedure for looping, loop variable properties.
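The idea of producing several output rows from one input row can be sketched in plain Python. This is only an analogy for the looping behaviour described above, not Transformer syntax: the function name, the `copies` column, and the `iteration` output column are hypothetical, with the counter playing the role of a loop iteration number.

```python
def loop_rows(input_rows, repeat_column):
    """Emit one output row per loop iteration for each input row."""
    for row in input_rows:
        iteration = 1
        # Loop condition: keep looping while the counter has not
        # passed the value held in the row's repeat column.
        while iteration <= row[repeat_column]:
            out = dict(row)                 # copy the input columns
            out["iteration"] = iteration    # derived loop value
            yield out
            iteration += 1


# Hypothetical input: the second row requests zero copies,
# so it produces no output rows at all.
source = [
    {"id": 1, "copies": 2},
    {"id": 2, "copies": 0},
    {"id": 3, "copies": 1},
]

expanded = list(loop_rows(source, "copies"))
```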
Connecting to the Teradata Enterprise Stage, properties of connection.
Generating data using Row Generator sequentially in a single partition, configuring to run in parallel.
Understanding the Aggregator Stage in DataStage, the two types of aggregation – hash mode and sort mode.
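The difference between the two aggregation modes can be illustrated with a small Python sketch. It is an analogy rather than DataStage code, and the function names and sample rows are assumptions: the hash approach keeps a running total per group in memory, while the sort approach expects input already sorted on the grouping key and only needs to hold one group at a time.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_hash(rows, key, value):
    """Hash-style: one in-memory entry per distinct key, any input order."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row[value]
    return totals


def aggregate_sorted(sorted_rows, key, value):
    """Sort-style: input must be pre-sorted on `key`; groups are streamed."""
    result = {}
    for k, group in groupby(sorted_rows, key=itemgetter(key)):
        result[k] = sum(r[value] for r in group)
    return result


# Hypothetical sample data.
rows = [
    {"region": "east", "sales": 10},
    {"region": "west", "sales": 7},
    {"region": "east", "sales": 3},
]

hash_totals = aggregate_hash(rows, "region", "sales")
sort_totals = aggregate_sorted(
    sorted(rows, key=itemgetter("region")), "region", "sales"
)
```

Both functions return the same totals; the trade-off is memory for many distinct groups (hash) versus the cost of sorting the input first (sort).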
In-depth study of the various stages in DataStage, and the importance of the Copy, Filter and Modify stages in reducing the number of Transformer Stages.
Understanding Parameter Sets, storing DataStage and QualityStage job parameters and default values in files, the procedure for deploying Parameter Sets, and their advantages.
Project 1 : Making sense of financial data
Industry : Financial Services
Problem Statement : Extract value from multiple sources & varieties of data in the financial domain
Description : In this project you will learn how to work with disparate data in the financial services domain and come up with valuable business insights. You will deploy IBM InfoSphere DataStage for the entire extract, transform, load process and use its parallel framework, either on premises or in the cloud, for high-performance results. You will also work with big data at rest and big data in motion.
Project 2 : Enterprise IT data management
Industry : Information Technology
Problem Statement : Software enterprises hold a lot of data, and it needs to be made sense of in order to derive valuable insights from it
Description : This project involves working with a company's existing data warehouse and deploying IBM DataStage on it for the extract, transform, and load processes. You will learn how DataStage manages high-performance parallel computing and how it implements extended metadata management and enterprise connectivity. This also includes combining heterogeneous data.
Project 3 : Medical drug discovery and development
Industry : Pharmaceutical
Problem Statement : A pharmaceutical company wants to speed up the process of drug discovery and development by using ETL solutions.
Description : This project deals with the domain of drug molecule discovery and development. You will learn how DataStage helps to make sense of the huge data warehouse that resides within the pharmaceutical domain, which includes data about patient history, existing molecules, the effects of existing drugs, and so on. The ETL tool DataStage will help to make the process of drug discovery easier.
Project 4 : Finding oil reserves in the ocean
Industry : Oil and Gas
Problem Statement : Finding new oil reserves is a herculean task. Huge amounts of data need to be parsed in order to find where oil exists in the ocean. This is where an ETL tool like DataStage is needed.
Description : This project deals with deploying an ETL tool like DataStage to parse petabytes of data for discovering new oil. This data could be in the form of geological data, sensor data, streaming data, and so on. You will learn how DataStage can make sense of all this data.
Intellipaat offers the most in-depth and comprehensive DataStage training that is in line with industry requirements. In this training, you will learn about the DataStage framework to help development and operations teams of leading software enterprises to successfully integrate, communicate, collaborate and automate processes. You will master the skills needed to create a DataStage roadmap, monitor key performance indicators and measure the critical success factors. Upon the successful completion of the training, you will be awarded Intellipaat DataStage Foundation Certification.
This training course equips you with the skills to apply for some of the best jobs in top MNCs around the world at top salaries. Intellipaat offers lifetime access to videos, course materials, 24/7 support and course material upgrading to the latest version at no extra fee. Hence, it is clearly a one-time investment.
This course is designed to help you clear the IBM Certified Solution Developer: InfoSphere DataStage certification exam. The entire course content is in line with the certification program and helps you clear the certification exam with ease and get the best jobs in top MNCs.
As part of this training, you will be working on real-time projects and assignments that have immense implications in real-world industry scenarios, thus helping you fast-track your career.
At the end of this training program, there will be a quiz that perfectly reflects the type of questions asked in the certification exam and helps you score better marks.
The Intellipaat Course Completion Certificate will be awarded upon completion of the project work (after expert review) and upon scoring at least 60% in the quiz. Intellipaat certification is recognized in 80+ top MNCs such as Ericsson, Cisco, Cognizant, Sony, Mu Sigma, Saint-Gobain, Standard Chartered, TCS, Genpact and Hexaware.
"PMI®", "PMP®" and "PMI-ACP®" are registered marks of the Project Management Institute, Inc.