Store Input

Once raw source data is stored in a Zuar Runner store, the Zuar Runner store itself can become a source (input) in an IO job.

Connect to Zuar Runner Store

The primary use case for this is to create several related relational database tables from a single nested data source (API data, JSON, etc).

Example Source Data

Let’s use this accounting JSON data as an example:

[
  {
    "amount": 500.00,
    "customer": {
      "id": 1,
      "name": "Customer 1"
    },
    "id": 1,
    "invoice_lines": [
      {
        "amount": 200.00,
        "id": 1,
        "product": "A"
      },
      {
        "amount": 300.00,
        "id": 2,
        "product": "B"
      }
    ],
    "invoice_number": 1001
  },
  {
    "amount": 240.00,
    "customer": {
      "id": 1,
      "name": "Customer 2"
    },
    "id": 2,
    "invoice_lines": [
      {
        "amount": 200.00,
        "id": 1,
        "product": "A"
      },
      {
        "amount": 40.00,
        "id": 2,
        "product": "C"
      }
    ],
    "invoice_number": 1002
  }
]

This dataset has two arrays [ ]:

  1. The data set as a whole is an array

  2. The invoice_lines data contains an array

So, Zuar Runner needs to split this data into two relational database tables.

Table 1 (invoice header)

id

invoice_number

amount

cusomer_id

customer_name

1

1001

500.00

1

Customer 1

2

1002

240.00

2

Customer 2

Table 1, can be created using the Zuar Runner JSON plugin.

Job #1 name: json_invoices

Table 2 (invoice lines)

invoice_id

id

line

product

amount

1001

1

1

A

200.00

1001

2

2

B

300.00

1002

3

1

A

200.00

1002

4

2

C

40.00

If Zuar Runner job #1 uses a Zuar Runner store, then Table 2, can be created using a Zuar Runner Store job.

Zuar Runner Store Job Config

Zuar Runner store jobs can be created using the Generic plugin.

Here’s an example job config that pipes data from a Zuar Runner store.

{
    "input": {
        "name": "json_invoices",
        "use": "mitto.iov2.input#StoreInput",
        "jpath": "$.invoice_lines[*]",
        "members": [
            {
                "name": "invoice_id",
                "value": "$.id"
            }
        ]
    },
    "output": {
        "tablename": "invoice_lines",
        "use": "call:mitto.iov2.db#todb",
        "schema": "json",
        "dbo": "postgresql://db/analytics"
    },
    "steps": [
        {
            "use": "mitto.iov2.steps#Input",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#ExtraColumnsTransform"
                },
                {
                    "use": "mitto.iov2.transform#ColumnsTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CreateTable"
        },
        {
            "use": "mitto.iov2.steps#Output",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#FlattenTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CollectMeta"
        }
    ]
}

The input section is the important part of this job config.

Name

The name parameter is the name of the “parent” job that is configured to use a store.

Jpath

The jpath parameter let’s you define exactly what section of the store’s JSON data to pull from. This is necessary for Table 2 because of the nested data in the invoice_lines array. Learn more about JSONPath (jpath) expressions below.

Query a record from the “parent” job’s Zuar Runner store API to determine the correct jpath.

Members

The members parameter let’s you add additional columns from any section of the store’s JSON data to the resulting database table. In our case, we need to add the id of each invoice to the invoice_lines table in order for us to join the two tables together. members also uses jpath in it’s value key, and you can name the resulting column by adjusting the name key.

JSONPath Expressions

To understand how to use JSONPath to pick specific sections of data out of a JSON object you can visit: https://goessner.net/articles/JsonPath/

To interactively create a JSON path, you can paste JSON into this GUI and select fields until you obtain your desired level of detail: https://jsonpathfinder.com/. This tool will help build you path.

To interactively learn or test how to use JSONPath syntax you can visit: https://jsonpath.com/

Best Practices

Always run the store job after the “parent” job because the store job requires the store from the “parent” job.