
I have an Impala table containing a large number of records (39,885,593), and I need to build a dashboard on that table in Tableau.

I have tried the following approaches:

1) Extract the data from the Impala table into a Tableau extract, then build the dashboard on the extract.
2) Use the data extract initially, then switch the connection to a Live connection.
3) Use a Live connection from the start.

Approach 1: I can build the dashboard from the data extract, and performance is good. The problem is that this is transactional data that grows every day, so the extract will take up more and more space on Tableau Server.

Approach 2: I can design the dashboard efficiently this way. However, once I switch the connection from the extract to Live, publishing the dashboard takes a long time, and the dashboard is also slow to open in the browser from Tableau Server.

Approach 3: The Live connection is very slow both while designing and while publishing the dashboard.

If anyone has dealt with a similar requirement, could you please suggest an approach?


1 Answer


  • Unless you need up-to-the-minute live access to millions of transaction records, I recommend working with extracts (possibly multiple extracts).

  • Reduce the size of each extract to the minimum needed to support your visualization. You can add data source filters, hide unused fields, and roll up (aggregate) the data in the extract to just the level of detail your view needs.

  • For large data sets, don't build a single extract that is just a copy of your entire data set. Make several smaller ones, each supporting one view or a small set of related views. Think of an extract as a materialized view.

  • If a view only displays 100 marks, strive to have only 100 records in the extract it uses, even if those 100 records summarize 100 million rows in the underlying data source.

  • Then you can have a larger extract or even a live source for people to use when drilling down into a (filtered) detail view, and the first views of your dashboard can launch quickly.

  • This way interactivity, refreshes, and publishing can be fast.

  • For this approach to work, you may need to get used to having multiple data sources in your workbook, even if they are all based on the same database, and to using filter actions, parameters, and calculated fields to filter and link across those data sources.
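
To make the "summary extract plus filtered detail source" idea concrete, here is a minimal sketch of the pattern. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 as a stand-in for Impala, and the table and column names (transactions, daily_region_summary, txn_date, region, amount) are hypothetical, not taken from the question. In practice the rollup would likely be a custom SQL query or an aggregated extract in Tableau, and the drill-down would be driven by a filter action rather than a function call.

```python
import sqlite3


def build_demo(conn):
    """Create a tiny hypothetical transactions table (stand-in for the Impala table)."""
    conn.execute(
        "CREATE TABLE transactions ("
        "txn_id INTEGER PRIMARY KEY, txn_date TEXT, region TEXT, amount REAL)"
    )
    rows = [
        (1, "2024-01-01", "East", 10.0),
        (2, "2024-01-01", "East", 15.0),
        (3, "2024-01-01", "West", 7.5),
        (4, "2024-01-02", "East", 20.0),
        (5, "2024-01-02", "West", 2.5),
        (6, "2024-01-02", "West", 30.0),
    ]
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)


def build_summary(conn):
    """Roll the detail rows up to one row per (date, region).

    This plays the role of the small extract behind the first dashboard view:
    it holds only as many rows as the view has marks.
    """
    conn.execute(
        """
        CREATE TABLE daily_region_summary AS
        SELECT txn_date, region,
               COUNT(*)    AS txn_count,
               SUM(amount) AS total_amount
        FROM transactions
        GROUP BY txn_date, region
        """
    )


def drill_down(conn, txn_date, region):
    """Fetch detail rows for one selected mark (what a filter action would request)."""
    cur = conn.execute(
        "SELECT txn_id, amount FROM transactions "
        "WHERE txn_date = ? AND region = ? ORDER BY txn_id",
        (txn_date, region),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
build_demo(conn)
build_summary(conn)

# The overview reads only the small summary table.
summary = conn.execute(
    "SELECT txn_date, region, txn_count, total_amount "
    "FROM daily_region_summary ORDER BY txn_date, region"
).fetchall()

# The detail source is queried only for the selected date and region.
detail = drill_down(conn, "2024-01-02", "West")

print(summary)
print(detail)
```

The point of the sketch is the shape of the data flow: the overview never touches the full transaction table, and the large source is only hit with a narrow filter once a user selects something to drill into.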