Architecture of Data Warehouse
Now that we understand the concept of Data Warehouse, its importance and usage, it’s time to gain insights into the custom architecture of DWH.
There are two main components to building a data warehouse- an interface design from operational systems and the individual data warehouse design. Thus, the construction of DWH depends on the business requirements, where one development stage depends on the results of previously developed phase.
The structure of a DWH can be understood better through its layered model, which lists the main components of the data warehousing architecture.
Below is the typical architecture of data warehouse consisting of different important components.
You can see that it is nothing but the movement of data from source to staging area and then finally to conformed data marts through ETL (Extract, Transform and Load) technology.
The first layer is the Data Source layer, which refers to various data stores in multiple formats like relational database, Excel file and others. These stores can consists of different types of data – Operational data including business data like Sales, Customer, Finance, Product and others, web server logs, Internet research data and data relating to third party like census, survey.
The next step is Extract, where the data from data sources is extracted and put into the warehouse staging area. The extracted data is minimally cleaned with no major transformations.
Then comes the Staging area, which is divided into two stages – data cleaning and data ordering. As the name suggests, this layer takes care of data processing methods, i.e. cleaning (removing data redundancy, filtering bad data) and ordering (allowing proper integration) of data. Overall, this stage allows application of business intelligent logic to transform transactional data into analytical data. It is indeed the most time consuming phase in the whole DWH architecture and is the chief process between data source and presentation layer of DWH.
Read these Top Trending Data Warehouse Interview Q’s that helps you grab high-paying jobs !
Finally, we have the Data Presentation layer, which is the target data warehouse – the place where the successfully cleaned, integrated, transformed and ordered data is stored in a multi-dimensional environment. Now, the data is available for analysis and query purposes. The information is also available to end-users in the form of data marts.
This data can then be accessed by various Business Intelligence tools like Tableau, Business Objects, and presented in multiple formats like tables, graphs, reports and others.
It is important to note that the data warehouse supports and holds both persistent (stored for longer time) and transient/temporary data. The major purpose of a data warehouse is the attainment of cleansed, integrated and properly aligned data so that it is easy to analyze and present to clients and customers in several businesses.
Watch This Video by Intellipaat
Further, since corporate and organizations in every sector deal with large amounts of data referred to big data, building a data warehouse is a must-have. For instance, every customer that has ever visited a website gets recorded along with each detail. Thus, all the information available is sliced (divided) into smaller fragments and then diced (analyzed and examined).