File Download (cURL)¶
Some workflows require Zuar Runner to download one or more files from a remote server. This is helpful when a system doesn’t expose a direct database connection or API for Zuar Runner to ingest the data. The downloaded files can be comma-separated data (.CSV files), spreadsheets (.XLSX files), or any other web-accessible files. After the files are downloaded, they are available on the Zuar Runner file system for processing in subsequent Zuar Runner jobs.
Zuar Runner provides the cURL (“command line tool and library for transferring data with URLs”) Job type for these file download operations.
Additional reference documentation for the cURL utility is beyond the scope of this document, but can be viewed at https://curl.se/docs/manpage.html.
File Download Job Creation Wizard¶
Create a job using the “File Download” plugin from the Zuar Runner UI:
Click Add > Job.
Click the “File Download” icon to start the wizard.
On the next screen, the wizard prompts for the details of the cURL Job to create, including the file URL, and offers the opportunity to adjust Advanced options for the cURL Job through command line arguments.
Finally, the wizard will prompt the user for a Title for the job being created.
cURL Job Setup¶
URL to Download¶
In this input, enter the full URL of the file you wish to download.
cURL Arguments¶
In this input, the user can enter non-default options that are passed to the cURL file download command. The default options are -s -b /tmp/cookies -L -O -f.
The user can add to these arguments as necessary. The default options configure the file download Job to:
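- -s: operate silently, showing no progress meter or error messages during the download
- -b FILE: read previously stored HTTP cookies from the file system path specified by FILE and send them with the request (writing received cookies to a file is handled by cURL's separate -c option)
- -L: follow download location changes signaled by HTTP 3xx redirect response codes
- -O: save the file under the same name as the remote file
- -f: fail with a non-zero exit status, and without saving the server's error page, when the server returns an HTTP error (400 or above)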
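To make the defaults concrete, the following shell sketch assembles the command line they correspond to and prints it without running it. The URL is the placeholder from the example configuration, and placing the URL after the options is an assumption about how Zuar Runner builds the command, not documented behavior.

```shell
# Default options documented for the File Download job.
default_args="-s -b /tmp/cookies -L -O -f"

# Placeholder URL; replace it with the file you want to download.
url="https://www.example_site.com/path/daily.csv"

# Assumption: the URL follows the options on the command line.
cmd="curl $default_args $url"

# Print the command instead of executing it, so nothing is downloaded here.
echo "$cmd"
```

Running the printed command by hand in a scratch directory is one way to check a URL and its options before saving them in a job.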
Explanation of cURL Arguments¶
Full documentation of the available cURL command line arguments is available at: https://curl.se/docs/manpage.html
Downloaded file location¶
The files downloaded by the cURL Job are stored by default in Zuar Runner’s file manager.
File naming¶
By default, Zuar Runner uses the -O flag, which creates a file in Zuar Runner using the file name of the source file. If you would like to rename the file, remove the -O flag and add -o {new name of file}.
For example, let’s say you are downloading a file named daily.csv. The Zuar Runner default would download the file and preserve the source filename. However, removing the -O flag and adding -o daily_metrics.csv would download the file and store the output file as daily_metrics.csv.
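As a small illustration of the change described above, this shell sketch derives the renamed argument string from the defaults by swapping -O for -o daily_metrics.csv. The sed substitution is only a way to show the before and after; in the wizard you would edit the cURL Arguments input directly.

```shell
# Default options, which keep the remote file name via -O.
default_args="-s -b /tmp/cookies -L -O -f"

# Replace -O with -o and an explicit output file name.
renamed_args=$(echo "$default_args" | sed 's/-O/-o daily_metrics.csv/')

echo "$renamed_args"
# prints: -s -b /tmp/cookies -L -o daily_metrics.csv -f
```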
Going further with this example, if you wish to keep a copy of each daily download file rather than overwriting it, it is effective to add a datetime to each filename. We can do this by specifying the -o argument and setting the output filename to something like daily_$(date +'%Y-%m-%d:%H:%M:%S').csv, which uses bash shell command substitution to populate a dynamic filename. The resulting filename will be similar to daily_2023-03-09:16:55:13.csv.
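The dynamic filename can be previewed in any bash-compatible shell. The sketch below assumes, as the text above describes, that the arguments are evaluated by such a shell; it only builds and prints the argument string and does not download anything.

```shell
# $(...) runs the date command and inserts its output into the string.
filename="daily_$(date +'%Y-%m-%d:%H:%M:%S').csv"

# Arguments with -O replaced by -o and the timestamped name.
args="-s -b /tmp/cookies -L -o $filename -f"

echo "$args"
```

Each run produces a different name because date is evaluated at the moment the string is built, which is what keeps successive downloads from overwriting each other.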
Example Zuar Runner cURL Job Configuration¶
After the wizard is completed, you can view the Zuar Runner job that was saved. In the JSON configuration listed, you will see something similar to this example, with the URL listed followed by the arguments:
{
    "url": "https://www.example_site.com/path/daily.csv",
    "args": [
        "-s",
        "-b",
        "/tmp/cookies",
        "-L",
        "-o daily_$(date +'%Y-%m-%d:%H:%M:%S').csv",
        "-f"
    ]
}