Export to Flat File¶
Zuar Runner can output data to delimited (e.g. comma, tab, pipe, etc) flat files.
Outputting to a flat file with a custom IO job¶
Here’s an example SQL database table we will export to CSV using a modified Query job.
Sample data for toCSV
Use the Generic plugin to create a custom IO job.
Set the type
to io
.
Use the job config below:
{
"input": {
"dbo": "postgresql://db/analytics",
"query": [
"SELECT * from excel.tennis_products;"
],
"use": "query.io#QueryInput"
},
"output": {
"delimiter": ",",
"path": "/var/mitto/data/output_{year}_{month}_{day}.csv",
"use": "call:mitto.iov2#tocsv"
},
"steps": [
{
"transforms": [
{
"rename_columns": false,
"use": "mitto.iov2.transform#ExtraColumnsTransform",
"include_empty_columns": true
}
],
"use": "mitto.iov2.steps#Input"
},
{
"transforms": [
{
"use": "mitto.iov2.transform#FlattenTransform"
}
],
"use": "mitto.iov2.steps#Output"
}
]
}
Job Config Keys¶
The input
below is used for a Query job, but the
key components when exporting to a flat file are the output
and the
steps
. The input can be an input from any IO job.
input¶
Contains the database and query: SELECT * from excel.tennis_products;
. Any
SELECT
statement will work. Zuar Runner will use the column names for headers.
output¶
Contains the output path, and the delimiter to use. /var/mitto/data/
is the
public directory. Files in this directory will be visible in the Files
page
(in the menu on the left of your Zuar Runner UI).
In this example we’re using ,
as a delimiter. For tabs use: \t
The output code uses the Python csv
package. Any key word arguments can be
passed to csv.writer
(https://docs.python.org/3/library/csv.html#csv.writer).
Path Variables¶
It is possible to include date and time variables in your output
path/filenames. In our example
/var/mitto/data/output_{year}_*{month}_*{day}.csv
, the strings inside
curly brackets {}
are variables; in this case the current year, month and
day.
Available Variables:
year - The current year (2020)
month - The current month (05)
day - The current day (26)
hour - The current hour (07)
start_time - Date and time the job started (05-26-2020T07:30:30)
last_year - 2019
last_month - 04
last_day - 25
last_hour - 06
next_year - 2021
next_month - 06
next_day - 27
next_hour - 08
week - The week number for week of the year. (01-52)
You can also pull out values from the last and next values:
Yesterday’s day is
{last_day.day}
, month is{last_day.month}
, and year is{last_day.year}
.
These are more complex examples using Python’s **strftime** on date values:
Today is
{now:%A}, {now:%B} {now:%d}, {now:%Y}
One year ago from today was
{last_year:%c}
The time is
{now:%X}
The time is
{now:%I}:{minute}:{second} {now:%p}
steps¶
Any steps for processing the data. The steps included in the example above are
default steps which will include empty columns (via "include_empty_columns":
true
) and will not rename columns (via "rename_columns": false
).
Resulting Flat File¶
The resulting flat file is created on Zuar Runner’s file system.
Output from toCSV job
You can download the file manually or use another job (e.g. email, Rclone, command line, etc) to send the file elsewhere automatically.