Intellipaat
Intellipaat

ETL Tutorial for Beginners

ETL Tutorial for Beginners
 1132 Views     0 Comment(s)

What is ETL ?

ETL in Computing stands for Extraction, Transformation and Loading.These terms are mainly used in Data warehousing.In this process data is extracted from different sources, transformed to a required format and finally loaded to a Target. Use of ETL cannot be described with any particular example as it varies from situation to situation. 

Criteria ETL
ETL Source file Database
ETL Destination file Data Warehouse
Parallel ETL tasks possible YES

ETL Lookup Stage :

ETL Lookup stage enables us to evaluate data with various options, but it cannot be used in case of a giant database as data can be analyzed only when it is in memory. But it is more preferable as compared to Joint and Merge stages. Additionally lookup stage also permits further condition based data-analysis. 

Properties of Lookup Stage:

  • It is a processing stage.
  • Works on memory by reading data from a data set.
  • Direct lookup on DB2 and Oracle can also be performed.
  • Lookup is also used for validation of Rows. It rejects a Row if it doesn’t not have a corresponding entry.
  • Lookup stage can have single or multiple reference link, Single input and Single output link. 

This Stage can be worked with 3 options:

  1. Equality Match: -This is also known as normal look. Data is looked with exact Case sensitive match.
  2. Casesless Match: -This function looks for values with no importance to Case Sensitivity.
  3. Range Match: -With the help of range function, a Lookup stage can be configured to look for a range of values between two lookup columns. 

Lookup stage is good for small amount of data as compared to join and merge stage, as lookup stage uses memory to process data it is not possible to use lookup stage for a huge amount of data. For databases or data sets with huge values Join Stage or Merge Stage is used. Drawback of join stage include the fact that it cannot reject a row with no corresponding entry.  Merge Stage can help us reject the values. 

Lookup Toolbar:

(ETL) Lookup toolbar has the following buttons:-

  • Stage properties:- Helps to define various stage properties like Stage Name, Link Properties
  • Conditions :- All conditions can be defined with the help of this button
  • Show all selected relation ;- Shows all selected relations
  • Cut/ Copy / Paste:- General Functions
  • Load / Save Column definitions: -Loading and saving of column data.
  • Find / Replace:- General Functions
  • Column Auto Match: – Helps to match column automatically depending on mapping.
  • Input / Output Link Execution Order: -Displays the links used for execution, helps in re-ordering.

Apart from lookup toolbar, we also have some editor components.

Link Area shows the details of Input and Output links.

Metadata Area shows metadata details of column and input and output links.

Shortcut menus assists in opening all toolbar function at one place. It also helps to validate and clear details, append a new column to a selected link. Select all columns in a link.

All lookup stage conditions can also be checked using the shortcut menu.

 

Related Articles

Suggested Articles