Data Warehouse Tutorial for Beginners
This is a free tutorial that serves as an introduction to help beginners learn the various aspects of data warehousing, data modeling, data extraction, transformation, loading, data integration and advanced features. This includes free use cases and practical applications to help you learn better.
This Data Warehouse Tutorial also covers data warehouse overview, data warehousing terminologies, DWH architecture, ‘What is ETL?’, Business Intelligence life cycle, DWH technology, OLAP and multidimensional models, DWH external/unstructured data in warehouse, merits and demerits of using DWH, DWH quality management and tools used and their advantages, star schema, and snowflake schema.
Check out this all-inclusive Data Warehouse Tutorial Video:
Following are the benefits of deploying a strong data warehousing technology:
- Stored data is processed using statistical analysis, reporting, etc.
- Production strategies can be fine-tuned using the data warehouse information
- Customer buying preferences can be possible by analyzing DWH information.
Data Warehouse (OLAP) vs. Online Transaction Processing (OLTP)
As we know that data warehouse works on OLAP (Online Analytical Processing) which is exactly different from OLTP (Online Transaction Processing). Following comparison will give show why OLAP is a way better concept than the conventional OLTP:
|Basis of difference||DWH (OLAP)||OLTP|
|Data types||Transactional data||Data from multiple sources|
|Suitable for||Basic business operations||Predicting future trends and analysis|
|Storage size||From 100 GB Upto 1 TB||100 MB to 1 GB|
|Number of records accessed||Millions||Fewer|
|Dimensional view of data||Multidimensional view||Flat relational view|
Looking for top jobs in Business Intelligence? This blog post gives you all the information you need!
Table of Contents
Data Warehouse Overview
Why Data Warehouse?
The famous author of several Data Warehouse books, William H. Inmon first coined the concept of Data Warehouse (DW) in 1990. Inmon defined data warehouse as ‘a subject-oriented, integrated, time-variant and non-volatile collection of data.’ Extremely useful for Data Analysts, this data helps them to take business decisions and other data-related decisions in the organization. Data Warehouse Read More
Data Warehousing Terminologies
Data Warehouse Terms
This section will illustrate important and most frequently used terms used in Data Warehousing. Metadata In simple terms, metadata provides the answers to all your data-related questions in the data warehouse. It refers to data about data giving users detailed explanation of of the syntax and semantics and describing all relevant attributes of the data in DWH. Read More
Architecture of Data Warehouse
Now that we understand the concept of Data Warehouse, its importance and usage, it's time to gain insights into the custom architecture of DWH. There are two main components to building a data warehouse- an interface design from operational systems and the individual data warehouse design. Thus, the construction of DWH depends on the business requirements, Read More
What is ETL?
What are ETL Tools?
ETL is the process of transferring data from source database to the destination data warehouse. In the process, there are 3 different sub-processes like E for Extract, T for Transform and L for Load. The data is extracted from the source database in the extraction process which is then transformed into the required format and then Read More
Business Intelligence Lifecycle
Phases of Business Intelligence
While data is captured in complex structures and databases to facilitate specific transaction requirements, organizations and businesses find it difficult to extract and capture the required information from data in transaction systems. Thus, there was a need to develop a system that can dependably take out data from the source systems and restructure the content appropriate Read More
Popular Data Warehouse Technologies
Interestingly, data warehousing requires simple technology features than the previous data-driven systems. For instance, DWH does not require online updating and demands minimal locking needs. This section delineates the technological requirements and features for the DWH. Data Warehouse Tutorial Video [videothumb class="col-md-12" id="uQDpwjaRcVk" alt="Data Warehouse Tutorial" title="Data Warehouse Tutorial"] Manage large amounts of Data As discussed Read More
OLAP and Multidimensional Model
What is OLAP?
Most times used interchangeably, the terms Online Analytical Processing (OLAP) and data warehousing apply to decision support and business intelligence systems. OLAP systems help data warehouses to analyze the data effectively. The dimensional modeling in data warehousing primarily supports OLAP, which encompasses a greater category of business intelligence like relational database, data mining and report writing. Data Read More
DWH External/Unstructured Data in Warehouse
Data Warehousing and the Unstructured Data
As we have discussed so far, it is clear that most enterprises build data warehouse using the data available within the internal source systems. Besides available internally in the organization, this data is structured and has been configured in a regular format. However, we could sometimes encounter chunks of data that is useful for Read More
Merits and Demerits of Using DWH
Advantages of Data Warehouse
Being a subject-oriented, integrated, time-variant and volatile, data warehousing caters several advantages to enterprises and users when implemented for business purposes. The successful application of DWH delivers great results and improves the overall functioning of every organization. Data Warehouse Tutorial Video [videothumb class="col-md-12" id="Dz0kKer5ApE" alt="Data Warehouse Tutorial" title="Data Warehouse Tutorial"] Delivers Enhanced Business Intelligence With data Read More
DWH Quality Management
How to manage data quality in the Data Warehouse?
With the aim of building a subject-oriented and integrated database, data warehouse is the key ingredient in taking quick and intelligent business decisions. The quality of data and data warehousing is thus a critical concern for the organizations. Moreover, since the data in the warehouse gets collected from multiple disparate sources, Read More
What is Star schema?
The implementation of a data warehouse and business intelligence model involves the concept of Star Schema as the simplest dimensional model. Similar to every other dimensional model, star schema consists of data in the form of facts and dimensions. The reason behind the name 'Star Schema' is that this data model resembles a star with ends Read More
What is Snowflake Schema?
In data warehousing, Snowflake Schema is the extension to star schema such that the tables are arranged in a complex snowflake shape. The concept is similar to star schema with a center fact table and multiple dimension tables radiating from the center except that the tables described as dimensions are normalized and consist of more hierarchies. Read More
The Open Source ETL Tools
Popular open source ETL tools
In the ETL processes, the tools are used for extracting data from different sources, transforming the data to different structures so that they fit into the data warehouse and then finally loading the filtered data to the destination databases. There are a large number of freely available tools. And they are: Apatar : It is another Read More
The ETL Testing Process
Testing Process in ETL
ETL, the process used during the transferring of data between databases is one of the significant concept in data warehousing. This process of ETL consists of sub-processes like Extracting of the data from the source database, transforming the extracted data to the format required to be accepted into the destination database and then finally loading the Read More