bing
Flat 10% & upto 50% off + 10% Cashback + Free additional Courses. Hurry up
×
UPTO
50%
OFF!
Intellipaat
Intellipaat

SSIS User Handbook

If you are looking for a tool to automate maintenance of SQL Server database which can perform data integration and workflow application, for extraction, transformation and loading of data. Then SSIS can be taken into consideration. This cheat sheet will guide you with the basic concepts which is required to get started with it. It is a handy reference sheet for the beginners and the one having knowledge about the BI tools.

You can also download the printable PDF of this SSIS cheat sheet.

SSIS FINAL CHEAT SHEET

SQL Server Integration Service (SSIS):

SSIS is a component of Microsoft SQL Server database which can be used to perform data migration tasks, by collecting the data from various data sources and storing it in the central location. The main function performed by SSIS is Extraction, Transformation and Loading

SQL Server Integration Service (SSIS)

SSIS Architecture

  • Extraction: It is the collection of data from various sources
  • Transformation: It is a different form of data collected from different sources and is converted according to the Business requirement
  • Loading: Data warehouse is used which contains the loaded data
  • Data warehouse: It captures the data from various sources for useful analysis and access
  • Data Warehousing: It is a collection of data accumulated which is used for assembling and managing data from various sources for better decision making in the Business

Features of SSIS:

  • Studio environment: SSIS includes two studios
    • SSDT: Referred to as SQL Server Data Tools which is used to develop integration service packages and perform several tasks like
      • Copying basic package data from source to destination.
      • Updating the properties of packages while run time.
      • Creating deployment.
      • Saving the package copies to SQL Server
  • SSMS: Referred to as SQL Server management studio which is used for the package management in a production environment and perform tasks such as
    • To create a folder for package management
    • Executive Package utility is used to run the packages and store on a local system
    • Generates a command line while running the executive package utility
    • To store and fetch packages from SQL Server
  • Packages: SSIS package is a collection of control flow and data flow. Control flow includes tasks and data flow task and data flow including source, transformation and destination
  • Expressions: SSIS Expressions are the combination of literals, identifiers and operators
  • Event Handling: Handling a workflow that is designed for the number of events in different ways where it might occur. It is similar to building the packages

Uses of SSIS:

  • It is used to combine data from multiple data sources
  • Populating data warehouses and data marts
  • For Cleaning and standardizing data
  • To Build BI into the data transformation process
  • Used for automating administrative functions and loading the data

SSIS Date/Time Data Types:

  • DT_BOOL: It is a 1-bit Boolean Value
  • DT_BYTES:  It is a binary data value and the length is variable, maximum length is up-to 8000 bytes.
  • DT_CY: It is a currency value and this data type is an eight-byte signed integer with a scale of 4 and a maximum precision of 19 digits
  • DT_DATE (Format: yyyy-mm-dd): It is a date structure that consists of year, month, day, hour, minute, seconds, and fractional seconds. The maximum scale of fractional seconds is 7 digits
  • DT_DBDATE: A date structure that consists of year, month, and day.
  • DT_DBTIM (Format: hh:mm: ss): A time structure that consists of hour, minute, and second.
  • DT_DBTIME2 (Format: hh:mm: ss[.fffffff]): A time structure that consists of hour, minute, second, and fractional seconds. The maximum scale of fractional seconds is 7 digits
  • DT_DBTIMESTAMP (Format: yyyy-mm-dd hh:mm: ss[. fff]): A timestamp structure that consists of year, month, day, hour, minute, second, and fractional seconds. The maximum scale of fractional seconds is 3 digits
  • DT_DBTIMESTAMP2 (Format: yyyy-mm-dd hh:mm: ss[.fffffff]): A timestamp structure that consists of year, month, day, hour, minute, second, and fractional seconds. The maximum scale of fractional seconds is 7 digits
  • DT_DBTIMESTAMPOFFSET (Format: yyyy-mm-dd hh:mm: ss[.fffffff] [{+|-} hh:mm]):A timestamp structure that consists of year, month, day, hour, minute, second, and fractional seconds. The maximum scale of fractional seconds is 7 digits
  • DT_DECIMAL: An exact numeric value with a fixed precision and a fixed scale. This data type is a12-byte unsigned integer with a separate sign, a scale of 0 to 28, and a maximum precision of 29
  • DT_FILETIME (Format: yyyy-mm-dd hh:mm:ss:fff): A 64-bit value that represents the number of 100-nanosecond intervalThe maximum scale of fractional seconds is 3 digits
  • DT_GUID: It is a globally unique identifier (GUID).
  • DT_I1: It is a one-byte, signed integer.
  • DT_I2: It is a two-byte, signed integer
  • DT_I4: It is a four-byte, signed integer.
  • DT_I8: It is an eight-byte, signed integer
  • DT_NUMERIC: An exact numeric value with a fixed precision and scale. This data type is a 16-byte unsigned integer with a separate sign.
  • DT_R4: It is a single-precision floating-point value.
  • DT_R8: It is a double-precision floating-point value
  • DT_STR: It is a null-terminated ANSI/MBCS character string with a maximum length of 8000 characters
  • DT_UI1: It is a one-byte, unsigned integer
  • DT_UI2: It is a two-byte, unsigned integer
  • DT_UI4: It is a four-byte, unsigned integer.
  • DT_UI8: It is an eight-byte, unsigned integer.
  • DT_WSTR: It is a null-terminated Unicode character string with a maximum length of 4000 characters
  • DT_IMAGE: It is a binary value with a maximum size of 2^31 -1 bytes
  • DT_NTEXT: It is a Unicode character string with a maximum length of 2^30-1 characters
  • DT_TEXT: An ANSI character string with a maximum length of 2^31-1 characters

Transformations available in SSIS:

  • Aggregate – It is used to apply aggregate functions to Record Sets to produce new output records from aggregated values.
  • Audit – It is used to add Package and Task Level Metadata – such as Machine Name, Execution Instance, Package Name, Package ID, etc.
  • Character map –It is used to Performs SQL Server operations such as changing data from lower case to upper case.
  • Conditional split – It is used to separate available input into separate output pipelines based on Boolean Expressions designed for each output.
  • Copy column – It is used to add a copy of column to the output so that we can later transform the copy keeping the original for auditing.
  • Cache: It is used to write the data from connected data source to a cache file
  • Data conversion – It is used to convert columns data types from one type to another type. It stands for Explicit Column Conversion.
  • Data mining query – It is used to perform data mining query against analysis services and to manage Predictions, Graphs and Controls.
  • Derived column – Used to create a new column from given expressions.
  • Export column – It is used to export an Image specific column from the database to a flat file.
  • Fuzzy grouping – It is used for data cleansing by finding rows that are likely duplicates.
  • Fuzzy lookup – It is used for Pattern Matching and Ranking based on fuzzy logic.
  • Import column – It reads image specific column from database onto a flat file.
  • Lookup – It performs search of a given reference object set against a data source and is used for exact matches only.
  • Merge – It is used to Merges two sorted data sets into a single data set into a single data flow.
  • Merge join – It is used to merge two data sets into a single dataset using a join junction.
  • Multi cast – It sends a copy of supplied Data Source onto multiple Destinations.
  • Row count – It is used to store the resulting row count from the data flow / transformation into a variable.
  • Row sampling – It captures the sample data by using a row count of the total rows in dataflow specified by rows or percentage.
  • Union all – It is used to Merge multiple data sets into a single dataset.
  • Pivot – It is used for data source normalization by converting rows into columns
  • Unpivot – It is used for de-normalizing the data structure by converting columns into rows in case of building Data Warehouses.

SSIS Expressions:

Some common expressions used in SSIS packages

StatementExpression
Create a file name with today’s dateExpression on the Flat File or File Connection Manager:
“C:\\Project\\MyExtract”+(DT_WSTR,30)(DT_DBDATE)GETDATE() + “.csv”
Expression Output Example: C:\Project\MyExtract2009-03-20.csv
Use a 2 digit dateRIGHT(“0” + (DT_WSTR, 2)MONTH(GETDATE()),2)
Expression Output: 03 (if the month is March)
Multiple condition if statementIn this example, the statement determines that if the ColumnName is blank or NULL, it will be set to unknown. To make a Logical AND condition, use “&&” instead of the “||” operator.
 ISNULL(ColumnName)||TRIM(ColumnName)==””?”Unknown”: ColumnName
Returns the first five characters from a zip codeDerived Column Transform in the Data Flow:
SUBSTRING(ZipCodePlus4,1,5)
Remove a given character from a stringDerived Column Transform in the Data Flow:
REPLACE(SocialSecurityNumber, “-“,””)
Uppercase dataDerived Column Transfrom in the Data Flow:
UPPER(ColumnName)
Replace NULL with another valueDerived Column Transform in the Data flow:
ISNULL(ColumnName)?”New Value”: ColumnName
Replace blanks with NULL valuesDerived Column Transform in the Data Flow:
TRIM(ColumnName)==”” ? (DT_STR, 4, 1252)NULL(DT_STR, 4, 1252) : ColumnName
Remove any non-numeric data from a columnScript Transform in the Data Flow Task with the code as follows:
Imports System.Text.RegularExpressions
Public Overrides Sub Input()_ProcessInputRows(ByVal Row As Input()Buffer)
If Row.ColumnName_IsNull = False Or Row.ColumnName = “” Then
Dim pattern As String = String.Empty
Dim r As Regex = Nothing
pattern = “[^0-9]”
r = New Regex(pattern, RegexOptions.Compiled)
Row.ColumnName = Regex.Replace(Row.ColumnName, pattern, “”)
End If
End Sub
Convert text to proper caseScript Transform with the line of partial code as follows:
Row.OutputName = StrConv(Row.InputName, VBStrConv.ProperCase)
Build dynamic SQL statementExpression on the SQLStatementSource property of Execute SQL Task(VB 2008):
“SELECT Column From ” + @[User::TableName] +WHERE
DateFilterColumn = ‘” + (DT_WSTR,4)YEAR(@
[User::DateTimeVar]) + RIGHT(“0” + (DT_WSTR,2)MONTH(@
[User::DateTimeVar]),2) + RIGHT(“0” + (DT_WSTR,2)DAY(@
[User::DateTimeVar]),2) + “‘”
Calculate beginning of the previous monthExpression on component or task:
(DT_DATE)(DT_DBDATE)DATEADD
(“dd”,-1 * (DAY( GETDATE() )-1), DATEADD(“month”,-1, GETDATE() ))
Round to the nearest two decimal markExpression on Derived Column Transform:
ROUND(YourNumber, 2)
Expression Output Example: 1.2600000

Download a Printable PDF of this Cheat Sheet

With this, we come to an end of SSIS Cheat sheet. To get in-depth knowledge, check out our interactive, live-online SSIS certification Training here, that comes with 24*7 support to guide you throughout your learning period. Intellipaat’s SSIS online training will make you proficient in gathering data from flat files, XML, and relational data sources. It involves working on data extraction, integration and transformation to align with business needs

Previous Next

Download Interview Questions asked by top MNCs in 2019?

"0 Responses on SSIS Cheat Sheet"

    Leave a Message

    100% Secure Payments. All major credit & debit cards accepted Or Pay by Paypal.
    top

    Sales Offer

    Sign Up or Login to view the Free SSIS Cheat Sheet.