In this blog article, we will look at the major distinctions between Fact Tables and Dimension Tables, as well as their functions in data warehousing.
Table of Contents
Unlock valuable insights for your business with business intelligence. Watch this video now
What are Fact Tables and Dimension Tables?
A Fact Table is a key table in data warehousing that maintains quantitative data about a business process or event, such as sales transactions, inventory levels, or customer orders. It covers the major indicators and measures that are used to analyze and assess a company’s performance. Fact Tables feature numerous rows and fewer columns because each row represents a unique instance of an event or transaction, and the columns indicate the many measurements connected with that occurrence.
A Dimension Table, on the other hand, is a supporting table that gives descriptive information about the data in the Fact Table. It comprises traits or features that aid in categorizing or filtering the data in the Fact Table. A Dimension Table for sales transactions, for example, can have attributes such as date, product, store, and customer, which offer context and meaning to the sales data in the Fact Table.
Dimension Tables often feature fewer rows and more columns than Fact Tables because each row represents a distinct dimension value or category, and the columns indicate the many properties associated with that value.
Relationship between Fact Tables and Dimension Tables:
In data warehousing, the relationship between Fact Tables and Dimension Tables is a fundamental aspect of both the star schema and snowflake schema. These schemas are widely used to organize data for efficient querying and reporting in business intelligence applications.
In the star schema, the Fact Table contains a composite key that incorporates keys from the relevant Dimension Tables. These Dimension Tables, in turn, have foreign keys that link back to the corresponding primary key in the Fact Table. This star schema structure simplifies querying and data analysis, as users can easily filter and aggregate data based on the attributes present in the Dimension Tables.
On the other hand, the snowflake schema extends the star schema by normalizing the Dimension Tables. This means that Dimension Tables in the snowflake schema are broken down into multiple related tables to eliminate data redundancy. Despite this normalization, the relationship between the Fact Table and the Dimension Tables is still maintained through the use of keys.
Differences between Fact Tables and Dimension Tables
Here’s a table enumerating the fundamental distinctions between Fact Tables and Dimension Tables:
|Basis||Fact Table||Dimension Table|
|Contents||Numeric values and transactional data.||Categorical data and descriptive attributes.|
|Purpose||Stores quantitative measures and metrics.||Provides descriptive attributes and context.|
|Size||Larger in terms of data volume.||Smaller in terms of data volume.|
|Aggregation||Aggregates data for analysis and reporting.||Provides context for data aggregation.|
|Querying||Provides data for analysis and calculations.||Used for filtering and categorization|
|Examples||Sales transactions and inventory levels.||Date, product, store, and customer dimensions|
|Rows||Many rows.||Fewer rows.|
Examples of Fact Tables and Dimension Tables
Here are a few examples of Fact Tables and Dimension Tables:
An Example of a Fact Table
Consider an Indian e-commerce firm that sells things online and wishes to analyze sales data to get business insights. A Fact Table for this firm may be the “Order” table, which would contain quantifiable information about each order transaction, such as the following:
- Order ID
- Date of order
- Product ID
- Customer ID
- Quantity ordered
- Price per unit
- Total order amount
Each row in this Fact Table will represent an order transaction, while the columns will provide the measurements and metrics related to that particular transaction. The following is how a single row in the Fact Table would look:
This row represents an order that occurred on April 12th, 2000, where the customer purchased three units of Product ID 1555 at a price of INR 1000 per unit, which resulted in a total order amount of INR 3000.
An Example of a Dimension Table
Continuing with the e-commerce scenario, the corporation may wish to provide extra context and classification for its sales data. Dimension Tables come in handy here. One Dimension Table for this corporation may be the “Product” table, which would contain descriptive information about each product sold, such as the following:
- Product ID
- Product Name
Each row in this Dimension Table will represent a unique product, while the columns will provide the attributes related to that particular product. This is how a single row in the Dimension Table would look:
This row represents a product with an ID of 1555, named “chair”, belonging to the “Furniture” category, specifically the “Household” sub-category, is manufactured by the brand “ABC,” and priced at INR 1000.
The company can organize and categorize its sales data by attaching the “Product” Dimension Table to the “Order” Fact Table using a common Product ID key. They may, for example, analyze sales by product category or brand or compute measures such as average product price or profit margin.
Enroll now in Tableau training to learn more about this Business Intelligence tool.
Types of Dimension Tables
Here are the types of Dimension Tables:
Slowly Changing Dimension (SCD) Table
Slowly Changing Dimension (SCD) tables are used to track historical changes in dimensional properties across time. They record changes in qualities such as customer information, product features, or geographic information. Type 1, Type 2, and Type 3 SCDs are the most common, and they are mentioned in brief below:
- Type 1: Replaces the old attribute value with the new value, with no history preservation.
- Type 2: Generates a new row for each modification while maintaining past values via effective date ranges or version numbers.
- Type 3: Inserts columns to keep track of restricted historical changes, often storing the current and past attribute values.
Role-Playing Dimension Table
When a single Dimension Table is connected with various roles or views, Role-Playing Dimension Tables are employed. By generating various aliases or views of the same Dimension Table, a “Date” Dimension Table, for example, may be used to analyze both order dates and shipment dates.
Hierarchy Dimension Table
Hierarchy Dimension Tables provide the hierarchical connection between dimension characteristics. A “Product” dimension, for example, might include a hierarchy with levels such as “Category,” “Sub-category,” and “Product.” This allows for data digging down or rolling up at various degrees of granularity.
Junk Dimension Table
A Junk Dimension Table is used to combine low-cardinality flags or indications into a single Dimension Table. It aids in lowering the number of columns in the Fact Table, simplifying the schema design, and boosting query efficiency.
Conformed Dimension Table
Conformed Dimension Tables are those s that are shared across numerous Fact Tables in a data warehouse. They maintain uniformity and facilitate the integration of data from many sources or topic areas. For example, a “Customer” dimension may be utilized in both sales and marketing data tables.
Types of Fact Tables
The following are the types of Fact Tables:
Transactional Fact Table
Transactional Fact Tables store information about specific events or transactions. They’re utilized to save granular, atomic-level data like sales transactions, purchase orders, and site clicks. Transactional Fact Tables often have a high number of rows and use foreign keys to connect to other Dimension Tables.
Periodic Snapshot Fact Table
Periodic Snapshot Fact Tables collect aggregated data at regular intervals (e.g., daily, weekly, and monthly). They save summarized information regarding a specific time period, such as total monthly sales or weekly website visitors. When compared to Transactional Fact Tables, Periodic Snapshot Fact Tables feature fewer rows, but they provide a greater degree of aggregation for analysis and reporting.
Accumulating Snapshot Fact Table
Accumulating Snapshot Fact Tables capture the state or progress of a process or workflow. They document significant milestones or events that occur during the lifespan of a business process, such as order processing or project management. Accumulating Snapshot Fact Tables allow for the recording and analysis of Key Performance Indicators (KPIs) at various phases of a process.
Factless Fact Table
Factless Fact Tables only include foreign keys that correlate with Dimension Tables with no measurements or metrics. They are helpful when it is necessary to track non-occurring events or combinations of occurrences, such as tracking consumers who did not make a purchase or tracking product pairings that were not sold together. Factless Fact Tables aid in the analysis of data patterns, trends, and exceptions.
In conclusion, Dimension Tables and Fact Tables are critical components of data warehousing. Dimension Tables give descriptive features, whereas Fact Tables provide quantitative data. Slowly changing, role-playing, hierarchy, junk, and conforming dimensions are examples of Dimension Tables.
Fact Tables can be transactional, periodic snapshots, accumulating snapshots, or factless. Understanding these tables is critical for successful data analysis and decision-making. They assist organizations in getting useful insights and understanding their business operations by working together.