Mitto Job: Subscriptions

Introduction

A subscriptions table can exist on any host. Each row in the table defines a single subscription to a report that will be created by performing a Tableau Export. Reports (also referred to as exports) can be created in CSV, PDF, or XLSX formats. Among other things, a subscription is defined by:

  • The name of a Tableau “view” that specifies the source of the data in the report

  • Optional JSON data consisting of key/value pairs that can be used to filter the data used to create the export

  • The frequency at which the subscription will be processed (daily, weekly, or monthly)

  • An optional day of the week on which to process the subscription

  • The time of day to process the subscription

  • A timezone associated with the time of day

  • One or more email addresses that will receive the report

Each time the Subscription Job runs on a Mitto instance, it records the time at which the job was started. It then inspects each record in the subscriptions table to determine if the subscription should have run between the time of the last run and the current time. If the subscription was defined to run during that interval, a Tableau export preformed to create the report and the is emailed to the recipients. Otherwise, the subscription is ignored.

Important

The Subscriptions Job processes all subscriptions defined in the subscriptions table – only one instance of a Subscriptions Job should be active on a Mitto instance.

Tip

Because subscriptions can be created with an hourly frequency, the Subscriptions Job should usually be scheduled to run a minimum of once an hour. If it is desired to have report creation and emailing that closely matches the time specified in the subscriptions table, the Subscriptions Job should be scheduled to run once a minute.

Reports

A report may contain either a single view or all the views in a workbook. PDF reports can be extensively customized for scaling, sizing, etc. CSV and XLSX reports can not be customized.

PDF Reports

PDF reports are created using Tableau’s Vizql API. If a report contains multiple views, all views are assembled into a single PDF file.

CSV Reports

CSV reports are created using Tableau’s REST API. If a report contains a single view, the report that is created is a single CSV file. If a report contains multiple views, a ZIP archive is created. Archives contain one CSV file per view. File names within the archive are a combination of the workbook and view name with a CSV extension.

XLSX Reports

XLSX reports (Excel workbooks) are created using Tableau’s REST API. The workbook contains one worksheet per view. Sheet names within the workbook are a combination of the workbook and view name.

Job Configuration

Reference: Subscriptions job configuration

A Subscription Job’s configuration comprises three sections: subscriptions, mail, and tableau. The subscriptions and mail sections work the same, no matter the type of report being exported. The tableau section section contains Tableau server credentials as well as defaults that are used for export jobs. Values provided by subscriptions always override default values that may be present in the tableau section.

We’ll begin by discussing the sections. The discussion is not exhaustive; rather it is intended to provide an overall feel for how the Subscriptions job works, in general. Detailed descriptions of each configurable parameter can be found in the job’s schema documentation. The following Subscription job configuration will be used as an example.

{
    "subscriptions": {
        /* Defines the location of 'subscriptions' table */
        "subscriptions_input": {
            /* Optional section controls relative dates */
            "relative_dates": {
                "start_datetime": "global_start_datetime",
                "end_datetime": "global_end_datetime",
                "datetime_format": "%Y-%m-%d %H:%M:%S.%f"
            },
            "dbo": "postgresql://localhost/mitto",
            "schema_name": "public",
            "table_name": "subscriptions"
        },
        /* Defines the location of 'subscriptions_status' table */
        "subscription_status_input": {
            "dbo": "postgresql://localhost/mitto",
            "schema_name": "public",
            "table_name": "subscription_status"
        }
    },
    /* Parameters for sending exports via email */
    "mail": {
        /* Defines the SMTP server to send mail with */
        "server": "smtp.zuar.com",
        "port": 25,
        "require_tls": true,
        /* SMTP server credentials (named credentials supported) */
        "credentials": {
            "username": "operations@zuar.com",
            "password": "<password>"
        },
        /* Defines the email used to send exported reports */
        "html": null,
        "mail_bcc": null,
        "mail_cc": null,
        "mail_from": "operations@zuar.com",
        "mail_subject": "default_mail_subject",
        "text": "email body text",
        "attachments": []
    },
    /* Defines Tableau server related parameters */
    "tableau": {
        /* Tableau server credentials (named credentials are supported) */
        "credentials": {
            "password": "<password>",
            "username": "<username>"
        },
        /* Tableau server configuration */
        "server": {
            "server": "https://tableau.zuar.com",
            "site": "",
            "user_id_to_impersonate": "5cf3542a-0c46-42fb-b6a5-27a96e41054a"
        },
        /* default parameters to be used in creating CSV export s */
        "export_csv": {
            "name": "subscription_content",
            "type": "csv"
        },
        /* default parameters to be used in creating XLSX export s */
        "export_xlsx": {
            "name": "subscription_content",
            "type": "xlsx"
        },
        /* default parameters to be used in creating PDF export s */
        "export_pdf": {
            "log_responses": false,
            "modifications": [
                {
                    "action": "update",
                    "destination": "files",
                    "items": {
                        "clientDimension": "{\"w\": 1280, \"h\": 1024}",
                        "dashboardPortSize": "{\"w\": 1280, \"h\": 1024}",
                        "worksheetPortSize": "{\"w\": 1280, \"h\": 1024}"
                    },
                    "request": "bootstrap"
                }
            ],
            "name": "subscription_content",
            "options": {
                "imageHeight": "0",
                "imageWidth": "0",
                "pageFitHorizontal": "1",
                "pageFitVertical": "1",
                "pageOrientationOption": "landscape",
                "pageScaleMode": "auto",
                "pageScalePercent": "100",
                "pageSizeOption": "letter"
            },
            "log_requests": false,
            "save_responses": true,
            "type": "pdf"
        }
    }
}

"subscriptions" Section

"subscriptions_input" Section

This defines the location of the subscriptions table. The subscriptions table must have the following columns:

  • id - A UUID that identifies the subscription.

  • user_id - A UUID belonging to a user on the Tableau server. When the export is run, the user associated with this UUID is impersonated.

  • email - One or more comma-separated email addresses to which the report will be emailed.

  • view_name - The view name to export. view_name should be in the format <workbook_curl>/<view_curl>, where <workbook_curl> is the _content_url of a workbook on the server and <view_curl> is the _content_url of a view in the workbook.

  • schedule - Defines how often the subscription is to run. Values are: DAILY, WEEKLY, and MONTHLY.

  • day_of - Defines the day on which the subscription is to run.

  • send_at - Defines the time at which the subscription is to run (format: HH:MM:SS).

  • tz - Is the timezone associated with send_at. This must be a timezone tz database format. E.g.: America/Chicago.

  • full - If true, all sheets in the Tableau workbook are included in the export. <view_curl> is ignored when full is true.

  • json_data - One or more Tableau “view filters” to apply to the view when creating the export. Relative Dates are supported in json_data. See the following section.

  • type - The type of export to create: pdf, csv, and xlsx.

  • orientation - The page orientation for PDF exports: landscape and portrait. The value is ignored for CSV or XLSX exports.

The Subscriptions job only reads data from this table; it never modifies it.

The following DDL will create a properly matched subscriptions table:

create table subscriptions
(
    id uuid not null
        constraint subscriptions_pkey
            primary key,
    user_id uuid,
    email varchar(256) not null,
    view_name varchar(256) not null,
    schedule varchar(8) not null,
    day_of integer not null,
    send_at time not null,
    tz varchar(256) not null,
    "full" boolean not null,
    json_data text not null,
    updated_at timestamp with time zone not null,
    created_at timestamp with time zone not null,
    type varchar(8) default 'pdf'::character varying,
    orientation varchar(16) default 'landscape'::character varying
);

alter table subscriptions owner to mitto;
Relative Dates

subscriptions.json_data may contain a date_range key and value. The value of the key is a string containing a relative date specification. The date_range key and value are converted into view filters specifying a start and end date. For example, if a subscription runs on 2020-01-12 and json_data contains:

"date_range": "YESTERDAY"

the view filters created from json_data will contain:

"global_start_datetime": "2010-01-11 00:00:00.000000",
"global_end_datetime": "2020-01-11 23:59:59.999999",

global_start_date and global_end_date are the default view filter keys used. These defaults can be overridden via the relative_dates section of the job configuration.

The following values can be used with date_range:

date_range values
THIS_HOUR
THIS_DAY
THIS_WEEK
THIS_MONTH
THIS_QUARTER
THIS_HALF
THIS_YEAR
NEXT_HOUR
NEXT_DAY
NEXT_WEEK
NEXT_MONTH
NEXT_QUARTER
NEXT_HALF
NEXT_YEAR
LAST_HOUR
LAST_DAY
LAST_WEEK
LAST_MONTH
LAST_QUARTER
LAST_HALF
LAST_YEAR
TODAY
TOMORROW
YESTERDAY
Modifying Relative Date Behavior

As described in the previous section global_start_datetime and global_end_datetime are the default keys and the default value format contains microseconds. These can be controlled with the optional relative_dates section of the job configuration.

For example, the following section:

"relative_dates": {
    "start_datetime": "begin",
    "end_datetime": "end",
    "datetime_format": "%Y-%m-%d %H:%M:%S.%F"
},

the view filters created from json_data will contain:

"begin": "2010-01-11 00:00:00.000",
"end": "2020-01-11 23:59:59.999",

The datetime_format string is a Python strftime format string that is used to format a Python datetime.datetime object. Note that the %F format code in the above example is not a standard Python strftime code; it is a special code implemented by Mitto to support Tableau’s use of microseconds instead of milliseconds for some datetime values. If the %F code is present it must be the two final characters of the datetime_format string.

References:

"subscriptions_status_input" Section

This defines the location of the subscriptions_status table. The subscriptions_status is used to record the results of runs of the Subscriptions job.

"mail" Section

The mail sections controls how email is sent using SMTP and its content.

The following parameters in the mail section, if present, will override the value in the associated column of the subscriptions table:

export parameter subscriptions column
mail_to email

tableau Section

The tableau section provides extensive control over the requests that are used to create the exported reports. It is identical to and serves the same purpose as the Export job configuration. Refer to the Export job for more information.

Tableau Credentials

Tableau credentials can be embedded in the job configuration via:

"credentials": {
    "password": "<password>",
    "username": "<username>"
},

Alternatively, existing named credentials can be referenced:

"credentials": "zuar-tableau-creds",

The credentials used can be either those of a Tableau Server admin or those of a non-admin user.

If the credentials are those of a Tableau Server administrator:

  • User-id impersonation can be used, if desired

  • The user-id to impersonate can be provided either via the user_id column of the subscriptions table or via tableau.server.user_id_to_impersonate in the job configuration.

  • If the user_id column of the subscriptions table and tableau.server.user_id_to_impersonate in the job configuration are both <null> / None, user-id impersonation is not performed.

If the credentials are not those of a Tableau Server administrator:

  • All values in the user_id column of the subscriptions table and tableau.server.user_id_to_impersonate in the job configuration must be <null> / None.

Tableau Export

It is likely that most users will accept the default values for the export session. If present in the job configuration, the values provided will override the defaults. The following parameters in the export section, if present, will override values in the associated column of the subscriptions table:

export parameter subscriptions column
full full
view_filters json_data
workbook workbook_name
view view_name
type type

If present in the export section, workbook and view will be combined to create a replacement for the view_name value in the subscriptions table.

Minimal Configuration

The initial job configuration shown earlier contains the builtin default values for many parameters for illustrative purposes. A smaller, simpler job configuration that uses defaults for most parameters follows:

{
    "mail": {
        "server": "smtp.zuar.com",
        "port": 25,
        "require_tls":  true,
        "credentials": "zuar-smtp-creds",
        "mail_from": "operations@zuar.com",
        "text": "email body text",
        "attachments": []
    },
    "subscriptions": {
        "subscriptions_input": {
            "dbo": "postgresql://localhost/analytics",
            "schema_name": "public",
            "table_name": "subscriptions"
        },
        "subscription_status_input": {
            "dbo": "postgresql://localhost/mitto",
            "schema_name": "public",
            "table_name": "subscription_status"
        }
    },
    "tableau": {
        "credentials": "zuar-tableau-creds",
        "server": {
            "server": "https://tableau.zuar.com",
            "site": ""
        }
    }
}

Testing / Debugging

  • As an aid to testing and debugging, a Subscription Job can be run from the command-line in the following manner (--current-time is optional):

    python3 job_subscription.py           \
        --current-time 'YYYY-MM-DD HH:MM' \
        /var/mitto/conf/example.json`
    

    This runs the Subscription Job setting the current time to the time provided.

  • A --noupdate option is available. When present, it prevents the subscription_status table from being updated with the status of a run.

Warning

Use of –current-time updates the last checked time for all subscriptions. If this is used on a production system, the last_checked column in the subscription_status table should be reset to its previous value. This can be done manually or by re-running job_subscription.py –current-time option using the previous value.

  • The mail.text parameter supports expansion of the {debug} template. This template will be replaced by the contents of the subscription that created the email and the export it contains as well as the actual export configuration used to create the export.

.