Tableau Performance: The Background
One of our larger customers purchased an on-premises Tableau Server that hosts Tableau views fed from about a hundred on-site manufacturing plants. The server is used to manage and visualize local, plant-level manufacturing data alongside an executive, rolled-up, corporate-level suite of dashboards hosted in a custom portal, all fed by an automated, Mitto-hosted data pipeline... or, more simply, "the Z stack" as we like to call it.
In this real-world customer use case, the solution stack also provides row-level security and data write-back capability so that individual sites can both view site-level data and dynamically update it as changes are made. There were many considerations at play in creating this solution, and in this discussion I will walk through some of the steps taken to optimize Tableau Server performance for it.
Tableau Server Configuration
When optimizing the configuration of Tableau Server, there are two main processes that consume CPU cycles: Backgrounder and VizQL Server. A little background: when active, the Tableau Server Backgrounder process will consume all available CPU cycles on a cluster while completing scheduled tasks. For this reason it is common to isolate the Backgrounder processes on separate server nodes to ensure that VizQL is not starved of CPU, which would hurt rendering performance and the user interaction experience. Conversely, this separation also allows increased VizQL processes and resources to directly influence performance. If you are new to Tableau Server processes, check out the overview of each Tableau Server process or the Scalability Whitepaper for a more in-depth discussion of process interplay.
Because row-level security and live-update capability are required, the design does not use extracts, so in our case isolating Tableau Server Backgrounders is not a consideration for the server configuration strategy. This application is also not positioned as a mission-critical component, so a multi-node High Availability (HA) design with cluster failover is not necessary either. For reference, if you are curious about other scenarios, check out the Tableau Performance Tuning overview of Tableau Server performance considerations and scenario configurations.
When this project began, the Tableau Server was configured in a two-node setup with a 16-core license and all licensed processes hosted on the second node. The request was to ensure that Tableau Server was performing as well as possible to support the overall solution and optimize the investment. With this environment and background in place, here are the changes we made to optimize the configuration and maximize performance.
The cluster configuration was simplified to a single node. As mentioned, Backgrounders are not at play in a significant way, so VizQL is free to consume the available CPU resources. The single node also removes the possibility of node-to-node cluster latency from network traffic when nodes are hosted on separate hardware. A note here: when working in a virtual environment, CPU cores are not the same as vCPUs, which is discussed in the "Note:" on the Tableau Recommended Baseline Configuration page. The best way I have found to see how many CPU cores Tableau is 'seeing' is to enter the following on the command line (Windows, since this is an Azure article):
"WMIC CPU Get NumberOfCores,NumberOfLogicalProcessors /Format:List"
This simple command can streamline dialogs with IT departments and ensure that everyone is talking about the same numbers, rather than worrying about what vCPU or hyper-threading mean. Use the "NumberOfCores" value for any Tableau Server sizing conversation. This is also the number Tableau uses for computing core licensing.
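If you want to turn that WMIC output into sizing numbers programmatically, here is a minimal sketch. It assumes the `/Format:List` output shape shown above (one `Key=Value` line per field, one block per CPU socket); the 8 GB-per-core RAM figure is the sizing rule of thumb used later in this article.

```python
def parse_wmic_cores(wmic_output: str) -> dict:
    """Sum physical cores and logical processors across all CPU blocks
    in `WMIC CPU Get NumberOfCores,NumberOfLogicalProcessors /Format:List`
    output (multi-socket machines emit one block per socket)."""
    totals = {"NumberOfCores": 0, "NumberOfLogicalProcessors": 0}
    for line in wmic_output.splitlines():
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in totals and value.isdigit():
            totals[key] += int(value)
    return totals

# Hypothetical example: a two-socket VM reporting 8 physical cores per socket.
sample = """NumberOfCores=8
NumberOfLogicalProcessors=16

NumberOfCores=8
NumberOfLogicalProcessors=16"""

info = parse_wmic_cores(sample)
cores = info["NumberOfCores"]   # value used for core licensing conversations
ram_gb = cores * 8              # 8 GB per physical core sizing rule of thumb
print(cores, ram_gb)            # 16 128
```

Note that the logical-processor count (32 here) is deliberately ignored for licensing; only physical cores matter.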
Our CPU and RAM setup is the default for a 16-core Tableau Server, which is calculated as 8 GB per CPU core to arrive at 128 GB of RAM.
For disk space requirements, the baseline Tableau configurations call for 50 GB for the OS, with additional space to hold extracts and repository data based on environment size. Interestingly absent from this page is any mention of disk type and speed, given the impact we will show it has on server performance.
Tableau Performance Tuning and Testing
By default, Tableau Server contains pre-built, performance-related administrator views, available under the 'Server Status' section when logged into the server as an admin. These views allow for looking at things like 'Performance of Views', a workbook-related metric, or 'Stats for Space Usage' on the hardware performance side, as outlined here.
All of these admin views are actually contained in a single tabbed workbook called 'tabbed admin views.twb' on the server, which can be republished back to the server as a regular Tableau workbook after setting up a connection to the Tableau Repository (Tableau Server's internal PostgreSQL DB). Details on setting up access to the Repository are found here; the Repository connection information is required when the workbook is reopened, so make sure to have the credentials handy. Making your own version of the admin views can be helpful if you don't have access to the admin views in your organization and are responsible for server performance, or if you would like to modify these views for your own purposes.
In my particular case I retrieved this workbook from the following location on the G:\ drive of a Tableau Server version 2019.3.5 installation; the path will differ based on drive installation location, server version number, and Tableau component process number. It looks a little something like this: <Root Tableau Server Drive>\Tableau Server\data\tabsvc\temp\vizqlserver_<Process#>.<VersionNumber>\tabbed admin views<RandomNumber>.twb. Make a copy of the file, store it in the workbooks folder, and rename it to "tabbed admin views.twb" to start with a new version of the workbook all your own.
With this workbook I wanted to go one step deeper than the 'tabbed admin views', or even what is mentioned on the Tableau website in the General Performance Guidelines, by adding information on RAM and CPU usage by Tableau Server process, as well as disk performance. This is possible using Windows Performance Monitor (PerfMon) to understand server hardware characteristics. The setup of PerfMon is outlined by Tableau here and can be used with the sample workbook provided on the Tableau website here. Also, here is a review of the Performance Monitor counters used to measure performance of virtual Windows environments.
Using these links as a starting point, here are a few tips that can make the setup easier.
1. Configuring individual counters in Performance Monitor is tedious, as each item has to be carefully selected from the list. Consider creating a 'User Defined Template', exporting the template to XML, opening the XML and adding the needed <Counter> and corresponding <CounterDisplayName> sections, then reimporting. This will save the time required to individually select each item. Note that Tableau Server process-related items will likely need to be modified to match the number of processes in your particular server configuration. Instructions for working with Windows Performance Monitor templates are here, and here is the XML portion of the template for copy-paste and time savings.
Windows Performance Monitor Template XML
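As a rough illustration only: a `<Counter>`/`<CounterDisplayName>` fragment inside a Data Collector Set template's `PerformanceCounterDataCollector` section looks something like the following. The counter paths shown are standard Windows counters; the `vizqlserver*` instance wildcard is an assumption matching this setup, and your process names and counts will differ.

```xml
<PerformanceCounterDataCollector>
  <Counter>\Processor Information(*)\% Processor Utility</Counter>
  <Counter>\Memory\Available MBytes</Counter>
  <Counter>\PhysicalDisk(*)\Avg. Disk Queue Length</Counter>
  <Counter>\Process(vizqlserver*)\% Processor Time</Counter>
  <CounterDisplayName>\Processor Information(*)\% Processor Utility</CounterDisplayName>
  <CounterDisplayName>\Memory\Available MBytes</CounterDisplayName>
  <CounterDisplayName>\PhysicalDisk(*)\Avg. Disk Queue Length</CounterDisplayName>
  <CounterDisplayName>\Process(vizqlserver*)\% Processor Time</CounterDisplayName>
</PerformanceCounterDataCollector>
```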
2. Selecting wildcard "*" Performance Counters will capture all members of that category, even when new items are added. In this case we use "Processor Information(*)\% Processor Utility" so that if/when additional cores are added to the server, they appear in the output file for Tableau to read after the extracted data source is refreshed.
3. Ensure that the 'Overwrite' option is not selected in Properties, so that a new log file is written out each time the counter is restarted (usually after modification). I suggest writing out all files and then using the wildcard union capability in Tableau to view all of the unioned CSV log files.
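The wildcard-union idea can also be prototyped outside Tableau. Here is a small, stdlib-only Python sketch that unions every CSV log matching a pattern and tags each row with its source file, much like Tableau's wildcard union adds a file-path field; the file names and column layout below are made-up examples, not real PerfMon output.

```python
import csv
import glob
import os
import tempfile

def union_csv_logs(pattern: str) -> list:
    """Union all CSV files matching `pattern` into one list of row dicts,
    tagging each row with its source file name."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as fh:
            for row in csv.DictReader(fh):
                row["source_file"] = os.path.basename(path)
                rows.append(row)
    return rows

# Demo with two tiny fake log files written to a temp directory.
tmp = tempfile.mkdtemp()
for name, body in [("perf_01.csv", "time,cpu\n10:00,42\n"),
                   ("perf_02.csv", "time,cpu\n11:00,87\n")]:
    with open(os.path.join(tmp, name), "w") as fh:
        fh.write(body)

unioned = union_csv_logs(os.path.join(tmp, "perf_*.csv"))
print(len(unioned))  # 2
```

This mirrors why keeping 'Overwrite' off matters: each restart produces a fresh file, and the union stitches them back into one continuous history.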
4. When accessing the CSV files from Tableau, it is possible to receive a cryptic error when the data source is refreshed. In my case the fix was to ensure that the logged-in user had access to the output directory by manually opening the folder. This can happen if the logged-in user is different from the authorized user defined in PerfMon.
5. If you use the template workbook provided by Tableau, the Dimension values will have to be updated (using Replace References...) to point to the values in your logging file. The Dimensions have the format \\<Machine Name>\<Log Value>, so the <Machine Name> portion of the Dimension name must be updated with your machine name (pictured above in the screenshot for #2).
6. By default these entries show up in Tableau as String "Abc" Dimensions; however, they should be converted to Measures with a default data type of "Number (decimal)" (pictured above in the screenshot for #2).
How to Know if You Need Tableau Performance Tuning
CPU by Process
When looking at CPU performance for a Tableau Server environment, the key is to appropriately manage server CPU utilization spikes. Here is a look at maximum CPU utilization per hour for the VM (top), with a stacked view showing max CPU utilization by Tableau process. As a benchmark, utilization above 80% is dangerous.
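The hourly-maximum view and the 80% benchmark can be expressed as a small check. This is a sketch, not the workbook's actual calculation: the `(hour, cpu_pct)` sample shape and the readings below are illustrative assumptions.

```python
from collections import defaultdict

CPU_DANGER_PCT = 80  # benchmark from this article: sustained spikes above this are dangerous

def hourly_max_cpu(samples):
    """samples: iterable of (hour, cpu_pct) readings. Return {hour: max cpu_pct}."""
    maxima = defaultdict(float)
    for hour, cpu in samples:
        maxima[hour] = max(maxima[hour], cpu)
    return dict(maxima)

def hours_over_threshold(samples, threshold=CPU_DANGER_PCT):
    """Hours whose maximum CPU utilization exceeded the danger benchmark."""
    return sorted(h for h, m in hourly_max_cpu(samples).items() if m > threshold)

# Hypothetical readings: only the 10:00 hour spikes past 80%.
readings = [(9, 35.0), (9, 62.5), (10, 91.0), (10, 55.0), (11, 78.0)]
print(hours_over_threshold(readings))  # [10]
```

Taking the maximum per hour (rather than the average) is the point: averages smooth away exactly the spikes that hurt users.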
Broken out by individual process, here is what this looks like. Note that the axis scales are intentionally not the same, so that the smaller processes remain visible.
The views show that VizQL (purple) is the major consumer, with minimal Backgrounder (blue) processes occasionally spiking for non-essential maintenance workbooks scheduled to refresh during non-peak times. Some management and cleanup was applied here to reduce Backgrounder utilization. Also at play is that the view starts with 8 CPU cores and ends with 16, which explains the reduction in % utilization across the view. While a good improvement, in our scenario we started with 16 CPU cores (not shown in the view), so even though more CPU headroom exists, perceived net end-user performance was about the same. This is 'not the droid we are looking for', so we continue looking.
This view shows Disk Queue on the C: drive only, where Tableau was initially installed. Disk Queue is the number of processes waiting in queue for access to the disk for read/write operations. Any queue above 2 (above the red reference line added to the base workbook) is considered high and not performant.
The key observation in the Disk Queue view, with Tableau Server installed on the C: drive, is that the C: drive is not keeping up with disk read/write operations. This can cause the OS to momentarily hang, which in turn hurts Tableau Server performance.
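The reference-line check from the Disk Queue view can be sketched as follows. The queue-depth threshold of 2 comes from this article; the sample shape and the readings are illustrative assumptions.

```python
DISK_QUEUE_LIMIT = 2  # queue depth above this is considered high / not performant

def flag_disk_queue(samples, limit=DISK_QUEUE_LIMIT):
    """samples: list of (timestamp, avg_disk_queue_length) readings.
    Return the readings above the limit, i.e. the points that would sit
    above the red reference line in the admin view."""
    return [(ts, q) for ts, q in samples if q > limit]

# Hypothetical PerfMon readings from '\PhysicalDisk(*)\Avg. Disk Queue Length'.
readings = [("10:00", 0.4), ("10:05", 3.1), ("10:10", 1.8), ("10:15", 5.6)]
print(flag_disk_queue(readings))  # [('10:05', 3.1), ('10:15', 5.6)]
```

If this list is routinely non-empty for the drive hosting Tableau Server, disk contention (not CPU or RAM) is the likely bottleneck, which is exactly what led to the drive changes below.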
Installing Tableau Server in the Cloud - AWS and Azure
As of the writing of this article, the Tableau Server requirements mention disk size, cores, and RAM for Minimum Production Requirements, with links to the following Virtual Machine (VM) whitepapers. Reading further into these sources reveals that disk speed is also a significant performance consideration, with recommendations provided. Concerning disk type, listed here are the sections of interest from the whitepapers for both Azure and AWS environments, with additional links for VM and storage pricing:
Whitepaper: Next Generation Cloud BI: Tableau Server hosted on Microsoft Azure
"D16s_v3 provides the equivalent of 8 CPUs (16 vCPUs), 64 GiB memory and 128 GiB SSD storage. If you are looking to trial Tableau Server or want a baseline hardware recommendation to use for testing your usage patterns this is a great size to start with."
"For production installations it is recommended to install Tableau Server on a separate drive of type Premium SSD disk type of at least 128 GB (P10 size)"
"Typically, you should choose an instance type and size in accordance with the minimum recommendations (8 cores and 64GiB memory) for deploying Tableau Server on Amazon EC2. Currently, EC2 instances like the m4.4xlarge and r4.4xlarge meet our criteria for RAM and CPU. Either is a good starting point for deployment."
"Amazon Elastic Block Store (Amazon EBS) delivers persistent block-level storage volumes for use with EC2 instances in the AWS cloud. We recommend deploying your EC2 instance with at least two volumes: • A 30 – 50 GiB volume for the operating system • A 100 GiB+ volume for Tableau Server You should leverage Amazon EBS General Purpose SSD (GP2) volumes. Over the long term, we have generally experienced below-average-to-poor performance using magnetic disks and therefore recommend you avoid them."
Also worth mentioning here are the AWS specifications from Tableau and an associated article discussing optimal server hardware for AWS.
Two changes were made to drive setup:
1. In Tableau Server we talk about distributing expensive processes across a cluster, and in this context the separation of Tableau Server from the OS drive is the same principle. Tableau Server was reinstalled onto a second G:\ drive to remove the disk queue contention on the single OS-hosted drive.
2. The drive type was changed to an SSD-class drive for better performance, to keep up with the disk queue.
This view shows some initial setup spikes on C:\, but it then stabilizes on the OS partition C:\, with the load going to G:\, the Tableau Server drive.
With the increase in CPU cores we also doubled the RAM on the node to 16 (cores) x 8 GB = 128 GB. Utilization wasn't bad previously; however, with the increased core count we also doubled the number of VizQL processes, which results in additional RAM utilization. Here the increased utilization is visible, but it appears lower as a percentage on the top graph because the total amount of RAM available was increased.
Even though this article lists changes to CPU, RAM, and disk, our environment began and ended with the same CPU cores (we are working with a core license) and about the same amount of RAM, with the only difference being the consolidation of a second node to simplify the architecture; in the future we may move to a three-node cluster once this becomes a more mission-critical solution. With this in mind, the change producing the largest end-user impact was the disk type and configuration, which allowed the server to function more responsively both when working on the server remotely via VPN & RDP and, significantly, when end users view Tableau dashboards, causing our customer advocate to write:
"It would appear between the fixes applied to the server and new resources the view load times have been cut in half. Looking at the Traffic to Views, it appears we have about the same traffic today as a week ago."
Here the performance change is visible on Nov 20 (red arrow), where we adjusted our server drive type to premium SSD and changed the OS/Tableau drive configuration. A second improvement (orange arrow) was also visible when the original 16-core configuration was restored by adding 8 cores to the base 8-core configuration.
The conclusion here is simple enough to both see and understand in the data; however, the journey to arrive at clear endpoints like this that improve performance requires the coordinated effort of multiple disciplines, applied knowledge, and experience. At Zuar our consultants regularly assist customers with Tableau installation, Windows and Linux upgrades, admin best-practice training, and server performance optimization exercises. Contact Us if you would like assistance optimizing the value of your Tableau investment, working with a Tableau, AWS, or Snowflake certified consultant.