• Articles
  • Tutorials
  • Interview Questions

What is Tableau Prep?— Tableau Prep Tutorial

What is Tableau Prep?— Tableau Prep Tutorial

Once you collect all the required data, you will realize that the data is diverse and mixed up. Everything is tracked differently across different files. There are even separate files that are from a specific category that is all the data for just a single year. Sorting through all of that can seem extremely confusing and difficult but Tableau Prep can help take care of this. You will learn about everything there is to know about Tableau Prep in this tutorial.

Check out this Tableau Training for the Beginners video tutorial:

Video Thumbnail

What is Tableau Prep?

Tableau Prep helps with the cleaning, shaping, and organization of data before it can be analyzed. It can gather a considerable amount of data from varied sources and transform them. Its simple drag-and-drop features quickly streamline the complicated tasks of pivots, unions, joins, aggregate, etc. Once the data is clean, it can be subsequently used in the Tableau Prep output as the data source for Tableau Desktop for analysis.

There are sample datasets included in Tableau Prep like the Sample Superstore that you can use to create a flow for Please note that the latest version of Tableau Prep Builder will be used here. Results may vary for previous versions.

Tableau Prep Builder Download and Installation

To download and install Tableau Prep Builder, you will need a Creator product key and the installer. Go to the Customer Portal for the most latest and updated version of Tableau Prep Builder.

The installer can be downloaded from the Product Downloads section according to your operating system. You will get the installers from the Product Downloads and Release Notes page. There is also a free trial version available.

Please note that Tableau Prep Builder is specifically built to work with Tableau Desktop. So, make sure to install it on the same system that runs Tableau Desktop and not the one that is running Tableau Server. Tableau Server Resource Manager (SRM) cannot differentiate between the Tableau Prep protocol server process and the Tableau Server protocol server process. SRM may end the protocol server process of Tableau Prep Builder in case the computer resources are exhausted, and this does not have a recovery mechanism.

Tableau Prep Builder automatically creates a My Tableau <application> Repository folder structure in the documents folder. If you want to move this repository to another location, you can specify the desired location and point Tableau Prep Builder to the new folder. However, changing the location does not move the files in the original repository.

EPGC IITR iHUB

Tableau Prep Tutorial

When you open Tableau Prep, you will see the following screen with the Connections pane:

Tableau Prep Tutorial

First, you will have to connect to a data source. This will create an input step. You can choose to connect to over 25 data sources with Tableau Data Prep. Go to Connections, select Microsoft Excel, and select the downloaded Superstore data file. Once the file is imported, you will see the data as shown below:

Connecting to the Data Source in Tableau Prep

There are various sheets on the left-hand side in the source file (or Excel file, in this case). You have to drag Orders from the left into the white area on the right, and you will see a list as shown below:

Orders in Tableau Prep

You can select any field/fields you want from here, but for the purpose of this tutorial, you will select Category, Order Date, Customer ID, Product ID, and Sales.

Field Selection in Tableau Prep

Once your required fields are selected, you will aim to find a customer’s first purchase date, the sales, categories, and the total distinct Product IDs. You will see different operations to choose from when you expand Order at the top window. Then, you will select Aggregate as highlighted in the image below:

Aggregate in Tableau Prep

Now, you will drag and drop the Customer ID to Grouped Fields on the right and the Order Date to Aggregated Fields.

Grouped Fields and Aggregated Fields in Tableau Prep

Now, you need to choose the Aggregation level for the Order Date as Minimum (see below).

Aggregated Fields in Tableau Prep

Similarly, you will add a new Aggregate (Aggregate 2) from the source orders. This time, Order Date and Customer ID should be brought into the Grouped Fields and Product ID, Sales, and Category into the Aggregated Fields.

Get 100% Hike!

Master Most in Demand Skills Now!

Aggregate 2

You need to make sure to choose the Aggregation level for Sales as Sum and for Product ID and Category as Count Distinct.

Aggregation Level in Tableau Prep

Now, let’s learn how to apply the Join operation. In this case, joining both the aggregated results will give you the first Purchase Date, Count of Products, Sales, and Categories. So, you will select Join from Aggregate 2 as shown below:

Join in Tableau Prep
Join 1

Now, you need to bring Aggregate 1 to Join 1, and you will see the following changes:

Join Aggregates

The next step will be to add the Join condition on Customer ID and Order Date.

Join Clauses

Now, you have all the fields required as well as the duplicate ones for Customer ID and Order Date. You can do away with the duplicates in the cleaning step.

Clean Step in in Tableau Prep
Clean 1

For clarity, it is better to rearrange the data and rename the fields—Order Date as 1st Purchase Date, Product ID as 1st Purchase Products, Sales as 1st Purchase Sales, and Category as 1st Purchase Categories.

Rename Field in in Tableau Prep
Renamed Fields

This gives you Customer IDs with their first date of purchase, sales, and the total distinct products and categories.

Data Science SSBM

Next, you will apply another Join operation on Aggregate 1 and Aggregate 2, but this time, with slightly different criteria. It will have a matching Customer ID, but the Order Dates will not be the same.

Join
Join 2
Order Date in in Tableau Prep

Again, you will add Clean Step and remove the duplicates, Customer ID-1 and Order Date-1. Your screen will look something like this now:

Clean Step 2

Next, you will apply a new Aggregate (Aggregate 3) to Clean 2 to find the minimum Order Date as the first Order Date was removed when Join 2 was applied. You need to drag Customer ID to Grouped Fields and Order Date to Aggregated Fields and change the Order Date level to a minimum.

Aggregate 3
Minimum Order Date

You will create Join 3 after Aggregate 3 and join Aggregate 2 with it on Order Date and Customer ID as given in the image below:

Join 3
Join Aggregate 3 and Aggregate 2
Join 3 Clauses

Now, you have the output as well as the duplicate fields, Customer ID-1 and Order Date-1. You will remove these fields after applying the Clean Step operation (Clean 3).

Clean 3

You will now rearrange and rename the fields, like before, for clarity—Order Date as 2nd Purchase Date, Product ID as 2nd Purchase Products, Sales as 2nd Purchase Sales, and Category as 2nd Purchase Categories.

Renaming Fields

Now, you have two datasets (outputs from Clean 1 and Clean 3) ready at hand. Now, you will combine them using Join on Customer ID as shown in the below images:

Join 4
Join 4 Clauses

You will see a duplicate Customer ID in the Join Results panel. You will remove the duplicate entry by applying Clean Step and then rearrange the data for proper arrangement.

Clean 4

Your final screen will look like this:

Tableau Prep Final Output

You can export the data into either a .csv, .hyper, or .tde file using the Output operation.

Output
Output.hyper
Save Output

Then, you need to click on Run Flow by which you will get a confirmation like below:

Run Flow

Now, this saved file is ready to be used in Tableau for data visualization and further analysis.

Course Schedule

Name Date Details
Tableau Certification Training Course 14 Dec 2024(Sat-Sun) Weekend Batch View Details
21 Dec 2024(Sat-Sun) Weekend Batch
28 Dec 2024(Sat-Sun) Weekend Batch

About the Author

Data Analyst & Machine Learning Associate

As a Data Analyst and machine learning associate, Nishtha combines her analytical skills and machine learning knowledge to interpret complicated datasets. She is also a passionate storyteller who transforms crucial findings into gripping tales that further influence data-driven decision-making in the business frontier.