Store Input¶

Once raw source data is stored in a Zuar Runner store, the Zuar Runner store itself can become a source (input) in an IO job.

The primary use case for this is to create several related relational database tables from a single nested data source (API data, JSON, etc).

Example Source Data¶

Let’s use this accounting JSON data as an example:

[
  {
    "amount": 500.00,
    "customer": {
      "id": 1,
      "name": "Customer 1"
    },
    "id": 1,
    "invoice_lines": [
      {
        "amount": 200.00,
        "id": 1,
        "product": "A"
      },
      {
        "amount": 300.00,
        "id": 2,
        "product": "B"
      }
    ],
    "invoice_number": 1001
  },
  {
    "amount": 240.00,
    "customer": {
      "id": 1,
      "name": "Customer 2"
    },
    "id": 2,
    "invoice_lines": [
      {
        "amount": 200.00,
        "id": 1,
        "product": "A"
      },
      {
        "amount": 40.00,
        "id": 2,
        "product": "C"
      }
    ],
    "invoice_number": 1002
  }
]

This dataset has two arrays [ ]:

The data set as a whole is an array
The invoice_lines data contains an array

So, Zuar Runner needs to split this data into two relational database tables.

Table 1 (invoice header)¶
id	invoice_number	amount	cusomer_id	customer_name
1	1001	500.00	1	Customer 1
2	1002	240.00	2	Customer 2

Table 1, can be created using the Zuar Runner JSON plugin.

Job #1 name: json_invoices

Table 2 (invoice lines)¶
invoice_id	id	line	product	amount
1001	1	1	A	200.00
1001	2	2	B	300.00
1002	3	1	A	200.00
1002	4	2	C	40.00

If Zuar Runner job #1 uses a Zuar Runner store, then Table 2, can be created using a Zuar Runner Store job.

Zuar Runner Store Job Config¶

Zuar Runner store jobs can be created using the Generic plugin.

Here’s an example job config that pipes data from a Zuar Runner store.

{
    "input": {
        "name": "json_invoices",
        "use": "mitto.iov2.input#StoreInput",
        "jpath": "$.invoice_lines[*]",
        "members": [
            {
                "name": "invoice_id",
                "value": "$.id"
            }
        ]
    },
    "output": {
        "tablename": "invoice_lines",
        "use": "call:mitto.iov2.db#todb",
        "schema": "json",
        "dbo": "postgresql://db/analytics"
    },
    "steps": [
        {
            "use": "mitto.iov2.steps#Input",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#ExtraColumnsTransform"
                },
                {
                    "use": "mitto.iov2.transform#ColumnsTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CreateTable"
        },
        {
            "use": "mitto.iov2.steps#Output",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#FlattenTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CollectMeta"
        }
    ]
}

The input section is the important part of this job config.

Name¶

The name parameter is the name of the “parent” job that is configured to use a store.

Jpath¶

The jpath parameter let’s you define exactly what section of the store’s JSON data to pull from. This is necessary for Table 2 because of the nested data in the invoice_lines array. Learn more about JSONPath (jpath) expressions below.

Query a record from the “parent” job’s Zuar Runner store API to determine the correct jpath.

Members¶

The members parameter let’s you add additional columns from any section of the store’s JSON data to the resulting database table. In our case, we need to add the id of each invoice to the invoice_lines table in order for us to join the two tables together. members also uses jpath in it’s value key, and you can name the resulting column by adjusting the name key.

JSONPath Expressions¶

To understand how to use JSONPath to pick specific sections of data out of a JSON object you can visit: https://goessner.net/articles/JsonPath/

To interactively create a JSON path, you can paste JSON into this GUI and select fields until you obtain your desired level of detail: https://jsonpathfinder.com/ . This tool will help build you path.

To interactively learn or test how to use JSONPath syntax you can visit: https://jsonpath.com/

Best Practices¶

Always run the store job after the “parent” job because the store job requires the store from the “parent” job.