
in BI by (47.2k points)

I have successfully integrated Tableau with the Spark Thrift Server using the Simba ODBC driver. I have been issuing CACHE TABLE statements in the Initial SQL, and the performance has been great so far. I am now looking for a way to cache and uncache a few frequently used tables whenever they are updated through our data pipelines.
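For reference, the Initial SQL is essentially just a cache statement along these lines (the table name here is only an example):

    CACHE TABLE frequently_used_table;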

The challenge I am facing is that a table cached via Tableau stays in cache for the lifetime of the Thrift Server, but when I write my data pipeline process and submit Spark jobs, those jobs use a different Spark context. Can anyone suggest how I can reach the Thrift Server's context from the backend process?

  1. Is there a way to re-use the Thrift Server's context from spark-submit or spark-shell?

  2. At the end of my data pipeline, would it be a good idea to invoke a simple shell script that connects to the Thrift Server and refreshes the cache?

Note: both my backend jobs and the BI tool use the same cluster, since I started the Thrift Server on the same YARN cluster to which I submit the backend jobs.

1 Answer

by (17.6k points)

You can connect to the Thrift Server on the same cluster with beeline, using the same JDBC URL and credentials that Tableau uses. Once the data pipeline completes, run:

UNCACHE TABLE MyTable; CACHE TABLE MyTable;
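For example, the refresh can be wired into the final step of the pipeline as a small shell command that invokes beeline (the host, port, credentials and table name below are placeholders, and it assumes the Thrift Server listens on the default HiveServer2 port 10000):

    beeline -u "jdbc:hive2://thrift-server-host:10000/default" \
            -n your_user -p your_password \
            -e "UNCACHE TABLE MyTable; CACHE TABLE MyTable;"

Since beeline executes the statements inside the same long-running Thrift Server context that Tableau connects to, the refreshed cache is visible to the BI reports as soon as the command completes.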
