Tableau Data Extracts & Tableau Data Sources

Overview

Tableau Data Extracts are quick and easy to manage, but if your organization has a large data set that needs to be shared, a Tableau Published Data Source is an excellent solution.

Tableau Desktop allows the user to connect to and analyze many types of data sources:

  • Excel files
  • JSON
  • CSV files
  • SQL Databases
  • Salesforce
  • Zuar Runner
  • And more...  

When a user establishes a connection to a data source, they face a couple of choices about how to manage the data. This article covers two options: Tableau Extracts* and Tableau Published Data Sources.

Need help managing these data sources and extracts? We are Tableau masters!

What are Tableau Data Extracts?

Tableau data extracts are a “snapshot” of data that is compressed, stored, and loaded into memory.

Understanding a Tableau Data Extract

The best way to understand a Tableau data extract is to look at an example scenario:

1. User 1 connects to the Superstore PostgreSQL table. Since this is an extensive database, a data extract is created. Multiple members of the organization can use this table.

Connection to PostgreSQL

2. Some sheets/dashboards are then created using this data extract:

3. User 1 then publishes the dashboard to Tableau Server and sets up an extract refresh schedule so that the dashboard's data is refreshed from the database every hour:

4. The next day, User 2 comes along and makes the same connection to the same database table, with the same extract refresh schedule, but creates a slightly different dashboard:

5. So now there are two dashboards with two data extracts from the same database table. Every hourly refresh also queries the database twice when it really only needs to query it once.

Disadvantages of a Tableau Data Extract

Based on this example, we can address a couple of disadvantages of Tableau data extracts.

Redundant Multiple Data Source Queries

If dashboards continue to be published in this manner, the following situation arises:

Figure I: Tableau Data Extracts. Note that each extract in 1, 2, 3, or 4 could contain exactly the same data. However, the data cannot be shared between workbooks.
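
To put a number on the redundancy, here is a minimal sketch of the arithmetic. It assumes every workbook carries its own extract on the hourly schedule from the scenario; the function name is hypothetical and nothing here calls Tableau itself.

```python
def database_queries_per_day(workbooks: int, refreshes_per_day: int) -> int:
    """Each workbook with its own extract runs its own refresh queries."""
    return workbooks * refreshes_per_day


HOURLY = 24  # one refresh per hour, as in the scenario above

for count in (1, 2, 4):
    total = database_queries_per_day(count, HOURLY)
    print(f"{count} workbook(s): {total} refresh queries per day")
```

The load on the database grows with the number of workbooks even though every extract may hold exactly the same rows.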

Querying Inefficiencies and Inconsistencies

In addition to this, on day 3, User 1 goes into her dashboard and creates the following calculation for Sales Commission, and uses this figure on a dashboard:

Sales Commission Calculation for User 1

Unaware of User 1's work, and not to be outdone, User 2 goes in and creates the same calculation but uses a rate of 15%, and uses this figure on a dashboard:

User 1 and User 2 have the same boss, who looks at both users' dashboards. The boss is confused. Not only is the same data being queried inefficiently, but the figures themselves are now inconsistent.
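
The inconsistency is easy to reproduce outside Tableau. The sketch below is illustrative only: the sales total and User 1's rate are assumptions (the article does not state them), while the 15% rate is User 2's from the scenario.

```python
from decimal import Decimal

SALES = Decimal("10000.00")  # hypothetical sales total, for illustration only

# Workbook-local calculations: each user defines the rate independently.
USER_1_RATE = Decimal("0.10")  # assumed; User 1's actual rate is not given
USER_2_RATE = Decimal("0.15")  # User 2's rate from the scenario


def sales_commission(sales: Decimal, rate: Decimal) -> Decimal:
    """Commission rounded to cents."""
    return (sales * rate).quantize(Decimal("0.01"))


user_1 = sales_commission(SALES, USER_1_RATE)
user_2 = sales_commission(SALES, USER_2_RATE)
print(f"User 1: {user_1}  User 2: {user_2}  difference: {user_2 - user_1}")
```

Both dashboards read the same sales figure, yet they report different commissions because the rate lives in each workbook rather than in one shared place.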

This is where a Tableau Published Data Source can help.

What is a Tableau Published Data Source?

A Tableau published data source is a centralized source that allows users to share data connections that they have defined. The published data source establishes a single source of truth and allows users to have confidence in the data they are analyzing.

Understanding a Tableau Published Data Source

The best way to understand a Tableau Published Data Source is to look at an example scenario:

  1. The Boss sits User 1 and User 2 down and gets them to agree on how commission should be calculated.  
  2. They agree that it's actually 12.5%.  

Two Options for Data Extracts Management

In this situation, there are two options to manage and maintain consistent data extracts:

  1. Keep separate Data Extracts, which User 1 and User 2 then have to keep in sync manually.
  2. Use a Tableau Published Data Source. Once they agree on the structure of the data, they can both access the same information efficiently and consistently.

How a Tableau Published Data Source Works

A Tableau Published Data Source does start with a data extract, but once all the checks have been made, it needs to be published to Tableau Server or Tableau Cloud:

Publish to Server...

  1. Name the data source appropriately and put it in the appropriate project.

  2. The data source now becomes available on Tableau Server. Users who have permission can even create a workbook with that data source using web edit. If you're using Tableau Desktop to build your dashboards instead, you can connect to the data source from there.

Connect to Tableau Data Source in Tableau Server
Select the verified data source

  3. User 1 and User 2 can now connect to this new Verified Data Source and use it as an established single source of truth for the data.

The data pipeline has also been made more efficient, as the data source is refreshed once from PostgreSQL rather than once by every workbook that is connected to it. The pipeline from PostgreSQL to dashboard now looks like this:

A Tableau Published Data Source queries the database only once and becomes the single source of truth for all connected workbooks
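
The refresh-once behavior can be sketched with a toy model. The classes below are hypothetical stand-ins, not Tableau APIs, and the two rows of data are made up; the point is only that both workbooks read one shared extract while the database is queried a single time.

```python
class Database:
    """Stand-in for the PostgreSQL table; counts how often it is queried."""

    def __init__(self, rows):
        self._rows = rows
        self.query_count = 0

    def fetch_all(self):
        self.query_count += 1
        return list(self._rows)


class PublishedDataSource:
    """Holds one extract that every connected workbook reads from."""

    def __init__(self, database):
        self._database = database
        self._extract = []

    def refresh(self):
        self._extract = self._database.fetch_all()

    def rows(self):
        return self._extract


db = Database([{"order_id": 1, "sales": 100.0}, {"order_id": 2, "sales": 250.0}])
source = PublishedDataSource(db)
source.refresh()  # one scheduled refresh

# Both workbooks read the shared extract; neither queries the database directly.
workbook_1_total = sum(row["sales"] for row in source.rows())
workbook_2_count = len(source.rows())
print(db.query_count, workbook_1_total, workbook_2_count)
```

Adding a third or fourth workbook to this model changes nothing on the database side, which is the efficiency gain the pipeline above describes.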

Certify the Published Data Source

As an additional step, you need to confirm for other users that the data source is indeed the source of truth. User 1, User 2, or even their boss can certify this Published Data Source, if permissions allow:

  1. Click the details icon on the Tableau Server data source page:

Users with appropriate permissions can Certify the data source

  2. Certify the data source. This puts a little green tick on its icon:

Certified Data Source

Other users can then be confident about using this data source.

Data Quality Warnings

Users with appropriate permissions can also place warnings on Published Data Sources. This can be useful during periods when the data source might be going through changes:

Data Quality Warnings can notify your users about the state of the data

If a Published Data Source has a warning on it, any attempt to connect to it displays that warning to the user.
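
As a rough mental model of how certification and data quality warnings sit alongside a data source, consider the sketch below. The class, function, and warning text are all hypothetical; Tableau manages this metadata through its own interface, and this only mimics the behavior described above.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSourceListing:
    """Toy model of a published data source's trust metadata."""
    name: str
    certified: bool = False
    quality_warning: Optional[str] = None


def connect(listing: DataSourceListing) -> str:
    """Surface any data quality warning, then report the connection."""
    if listing.quality_warning:
        warnings.warn(f"{listing.name}: {listing.quality_warning}", UserWarning)
    badge = "certified" if listing.certified else "not certified"
    return f"Connected to {listing.name} ({badge})"


orders = DataSourceListing(
    name="super_store_orders",
    certified=True,
    quality_warning="Under maintenance",  # hypothetical warning text
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    message = connect(orders)

print(message)
print([str(w.message) for w in caught])
```

Note that the two flags are independent in this model: a certified source can still carry a temporary warning while it is being changed.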

Best Practices for Maintaining Tableau Published Data Sources

It does take some extra management to maintain a Tableau Published Data Source. At Zuar, the recommended practice is to keep the workbook used to publish the data source separate from the workbooks used for day-to-day analysis.

For example:

  1. User 1 creates dashboards but is also the Data Custodian over the data source created above.
  2. User 1 creates dashboards in Sales Analysis.twb. This workbook has a connection to the Tableau Published Data Source.
  3. User 1 should maintain a separate workbook (e.g., Master Verified Data Source - super_store_orders.twb) that contains the original Tableau Extract for the verified data source. Publishing to the Published Data Source on Tableau Server should be performed from this workbook, and only this workbook.

Common Pitfalls of using Tableau Published Data Sources

There are several common pitfalls of using Tableau Published Data Sources that you should be aware of before adopting them into your system:

A user cannot edit calculated fields from a Tableau Data Source
  • Calculated fields can be added to a given workbook, but they will be local to that workbook only and will not be available to others using that published data source. If a calculation needs to be shared, it can be added in the Master workbook described in step 3 of the example above.
  • Be careful with calculations involving row-level security. Row-level security must be local to the workbook to be secured. Leave row-level security calculations out of the Published Data Source and handle these on a workbook-by-workbook basis.
  • Dimension aliases also need to be handled at the Master Data Source layer.
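
The scoping rule for calculated fields can be pictured with a small sketch. Everything here is hypothetical (class names, the second workbook's name, and the "Commission Bonus" field); it models visibility only and is not how Tableau stores fields. The shared formula uses the 12.5% rate agreed earlier.

```python
class SharedDataSource:
    """Toy stand-in for a Published Data Source and its shared calculated fields."""

    def __init__(self, calculated_fields):
        self.calculated_fields = dict(calculated_fields)


class Workbook:
    """Toy stand-in for a workbook connected to a shared data source."""

    def __init__(self, name, source):
        self.name = name
        self.source = source
        self.local_fields = {}  # visible in this workbook only

    def add_local_field(self, field_name, formula):
        self.local_fields[field_name] = formula

    def available_fields(self):
        # Shared fields come from the data source; local ones are layered on top.
        return {**self.source.calculated_fields, **self.local_fields}


source = SharedDataSource({"Sales Commission": "[Sales] * 0.125"})
user_1_workbook = Workbook("Sales Analysis", source)
user_2_workbook = Workbook("Regional Sales", source)  # hypothetical workbook name

# A field added inside one workbook never reaches the other workbook.
user_2_workbook.add_local_field("Commission Bonus", "[Sales Commission] * 0.5")

print(sorted(user_1_workbook.available_fields()))
print(sorted(user_2_workbook.available_fields()))
```

Only fields defined on the shared source appear in both workbooks, which is why shared calculations belong in the Master workbook.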

Tableau Workbooks and Sheets

Tableau workbooks are where you store your work. A workbook contains sheets, which can be worksheets, dashboards, or stories. Each worksheet contains a single view of your data, and a dashboard is where you can see a collection of views from multiple worksheets.

A “story” is an organized sequence of worksheets or dashboards that relate to each other. Tableau workbooks are a great way to keep the views built on your data sources and extracts organized for maximum efficiency.

Tableau Data Extracts vs. Tableau Data Sources

There is a place for both Tableau extracts and Tableau Published Data Sources. It just depends on the use case.

Data Extracts

Data extracts are essentially a compressed snapshot of the data that is loaded into memory and can be recalled quickly for visualization. This offers much faster access to your workbooks.

Tableau Published Data Sources

Tableau Published Data Sources are a great way to centralize data, establish a single source of truth, and allow users to have confidence in the data they are analyzing.

When a Tableau Published Data Source uses a live connection, it offers real-time updates for your data. However, because the information is pulled straight from the database instead of from memory, performance usually isn’t as fast as with Tableau extracts.

Criteria        | Tableau Extracts                        | Tableau Data Sources
----------------|-----------------------------------------|-------------------------------------------
Design          | Snippets of data                        | Centralized data
Performance     | Faster because data is saved in memory  | Slower because pulling data from database
Speed           | Faster access to workbooks              | Real-time updates for data
Trusted Sources | Snapshot of some data                   | Single source of truth for all data

Still confused about Tableau extracts and data sources? Get in touch with Zuar today!


*Generally a Tableau Extract is used for large data sources.  Live connections can be used with small (or optimized) data sources.

