Once raw source data is stored in a Mitto store, the Mitto store itself can become a source (input) in an IO job.

The primary use case for this is to create several related relational database tables from a single nested data source (API data, JSON, etc).
Example Source Data
Let's use this accounting JSON data as an example:
[
{
"amount": 500.00,
"customer": {
"id": 1,
"name": "Customer 1"
},
"id": 1,
"invoice_lines": [
{
"amount": 200.00,
"id": 1,
"product": "A"
},
{
"amount": 300.00,
"id": 2,
"product": "B"
}
],
"invoice_number": 1001
},
{
"amount": 240.00,
"customer": {
"id": 1,
"name": "Customer 2"
},
"id": 2,
"invoice_lines": [
{
"amount": 200.00,
"id": 1,
"product": "A"
},
{
"amount": 40.00,
"id": 2,
"product": "C"
}
],
"invoice_number": 1002
}
]
This dataset has two arrays [ ]
:
- The data set as a whole is an array
- The
invoice_lines
data contains an array
So, Mitto needs to split this data into two relational database tables.
Table #1 (invoice header)
id | invoice_number | amount | cusomer_id | customer_name |
---|---|---|---|---|
1 | 1001 | 500.00 | 1 | Customer 1 |
2 | 1002 | 240.00 | 2 | Customer 2 |
Table 1, can be created using the Mitto JSON plugin.
Job #1 name: json_invoices
Table #2 (invoice lines)
invoice_id | id | line | product | amount |
---|---|---|---|---|
1001 | 1 | 1 | A | 200.00 |
1001 | 2 | 2 | B | 300.00 |
1002 | 3 | 1 | A | 200.00 |
1002 | 4 | 2 | C | 40.00 |
If Mitto job #1 uses a Mitto store, then Table 2, can be created using a Mitto Store job.
Mitto Store Job Config
Mitto store jobs can be created using the Generic plugin.
Here's an example job config that pipes data from a Mitto store.
{
"input": {
"name": "json_invoices",
"use": "mitto.iov2.input#StoreInput",
"jpath": "$.invoice_lines[*]",
"members": [
{
"name": "invoice_id",
"value": "$.id"
}
]
},
"output": {
"tablename": "invoice_lines",
"use": "call:mitto.iov2.db#todb",
"schema": "json",
"dbo": "postgresql://localhost/analytics"
},
"steps": [
{
"use": "mitto.iov2.steps#Input",
"transforms": [
{
"use": "mitto.iov2.transform#ExtraColumnsTransform"
},
{
"use": "mitto.iov2.transform#ColumnsTransform"
}
]
},
{
"use": "mitto.iov2.steps#CreateTable"
},
{
"use": "mitto.iov2.steps#Output",
"transforms": [
{
"use": "mitto.iov2.transform#FlattenTransform"
}
]
},
{
"use": "mitto.iov2.steps#CollectMeta"
}
]
}
The input
section is the important part of this job config.
Name
The name
parameter is the name of the "parent" job that is configured to use a store.
Jpath
The jpath
parameter let's you define exactly what section of the store's JSON data to pull from. This is necessary for Table 2 because of the nested data in the invoice_lines
array. Learn more about JSONPath (jpath) expressions below.
Query a record from the "parent" job's Mitto store API to determine the correct jpath
.
Members
The members
parameter let's you add additional columns from any section of the store's JSON data to the resulting database table. In our case, we need to add the id
of each invoice to the invoice_lines
table in order for us to join the two tables together. members
also uses jpath in it's value
key, and you can name the resulting column by adjusting the name
key.
JSONPath Expressions
To understand how to use JSONPath to pick specific sections of data out of a JSON object you can visit: https://goessner.net/articles/JsonPath/
To interactively learn how to use JSONPath syntax you can visit: https://jsonpath.com/
Best Practices
Always run the store job after the "parent" job because the store job requires the store from the "parent" job.