Overview

The JSON plugin creates a table in a database from any JSON file.

Example use case

Let's take a simple JSON file with nested data and convert the data into tables in a relational database.

Starting JSON file

Here's what the source JSON file (people_pets.json) looks like:

[
  {
    "id": 1,
    "name": "Justin",
    "pets": [
      {
        "id": 1,
        "name": "Bear",
        "pet": {
          "type": "dog",
          "breed": "Goldendoodle"
        }
      },
      {
        "id": 2,
        "name": "Birdie",
        "pet": {
          "type": "dog",
          "breed": "Goldendoodle"
        }
      }
    ]
  },
  {
    "id": 2,
    "name": "Matt",
    "pets": []
  },
  {
    "id": 3,
    "name": "Ben",
    "pets": [
      {
        "id": 3,
        "name": "Zuca",
        "pet": {
          "type": "dog",
          "breed": "Cavapoo"
        }
      }
    ]
  }
]

Expected Relational Database results

Because the JSON file has nested data in the pets array ([ ]), we will need to create two tables in the database.

  • Table 1 will include top level key/value pairs: id, name. The resulting table should have 3 rows.
  • Table 2 will include data in the pets array (id, name, and pet object's data). Table 2 will also include data from the parent object (id) so we can join the two database tables. The resulting table should have 3 rows.

Table 1 expected result:

id name
1 Justin
2 Matt
3 Ben

Table 2 expected result:

id name pet__type pet__breed person_id
1 Bear dog Goldendoodle 1
2 Birdie dog Goldendoodle 1
3 Zuca dog Cavapoo 3

Create a Mitto JSON job

The JSON file(s) must exist in Mitto's file system. Use the file manager to manually add a JSON file to Mitto.

The JSON plugin is a custom job type, so create a job using the Generic plugin.

Example Mitto job configurations

Table 1 Mitto job configuration:

{
    "input": {
        "use": "flatfile.iov2#JsonInput",
        "source": "/var/mitto/data/people_pets.json"
    },
    "output": {
        "tablename": "people",
        "use": "call:mitto.iov2.db#todb",
        "schema": "json",
        "dbo": "postgresql://localhost/analytics"
    },
    "steps": [
        {
            "use": "mitto.iov2.steps#Input",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#ExtraColumnsTransform"
                },
                {
                    "use": "mitto.iov2.transform#ColumnsTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CreateTable"
        },
        {
            "use": "mitto.iov2.steps#Output",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#FlattenTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CollectMeta"
        }
    ]
}

This example job take the source JSON file (people_pets.json) and creates a database table (json.people) in Mitto's internal PostgreSQL database (localhost). It creates columns from all the "top level" keys excluding the arrays ([ ]).

Table 2 Mitto job configuration:

{
    "input": {
        "source": "/var/mitto/data/people_pets.json",
        "use": "flatfile.iov2#JsonInput"
    },
    "output": {
        "dbo": "postgresql://localhost/analytics",
        "schema": "json",
        "tablename": "people__pets",
        "use": "call:mitto.iov2.db#todb"
    },
    "steps": [
        {
            "use": "mitto.iov2.steps#Input",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#PluckV2Transform",
                    "jpath": "$.pets[*]",
                    "members": [
                        {
                            "name": "person_id",
                            "value": "$.id"
                        }
                    ]
                },
                {
                    "use": "mitto.iov2.transform#ExtraColumnsTransform"
                },
                {
                    "use": "mitto.iov2.transform#ColumnsTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CreateTable"
        },
        {
            "use": "mitto.iov2.steps#Output",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#FlattenTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CollectMeta"
        }
    ]
}

This example job take the source JSON file (people_pets.json) and creates a database table (json.people__pets) in Mitto's internal PostgreSQL database (localhost). Based on the jpath and members parameters in the steps, it creates columns from all the keys inside the pets array and the "top level" id key.

Jpath

The jpath parameter let's you define exactly what section of the source JSON file to pull from. This is necessary for Table 2 because of the nested data in the pets array. Learn more about JSONPat expressions below.

Members

The members parameter let's you add additional columns from the source JSON file to the resulting database table. In our case, we need to add the id of each person to the people__pets table in order for us to join the two tables together. members also uses jpath in it's value key, and you can name the resulting column by adjusting the name key.

JSONPath Expressions

To understand how to use JSONPath to pick specific sections of data out of a JSON object you can visit: https://goessner.net/articles/JsonPath/

To interactively learn how to use JSONPath syntax you can visit: https://jsonpath.com/