# Query

You can use Zuar Runner's Query plugin to create a relational database table from any SQL query against any relational database.

.. image:: assets/query-1.png
   :alt: Relational Database

The source database and destination database can be Zuar Runner's built in
[PostgreSQL](/databases/postgresql/) database or any other
remote database (e.g. [Snowflake](/databases/snowflake/),
[Amazon Redshift](/databases/amazon-redshift/), [Microsoft
SQL Server](/databases/microsoft-sql-server/), etc). 

This is the preferred way to mirror data from other databases.

## Caveats

Zuar Runner needs network access to the source and destination
databases. This typically means whitelisting IP addresses on
potentially both sides (remote database server and Zuar Runner).

## Create a Query Job

### Start Creating the Job:

Click "+ Add Job".

.. image:: assets/query__add_job.png
   :alt: Add job

Select "Query".

.. image:: assets/query__icon.png

### Write the Database Query:

You will see a screen like this:

.. image:: assets/query__connect.png
   :alt: Query Connect Screen

**Database URL**

Learn more about [Database URLs](/databases/database-urls/) here.

**Encrypting Username and Password**

Learn how to store and [encrypt your database username and
password](/databases/database-urls/) to be used in a query
string.

**Query**

Type the query you want to create a table from in the Query area.

It is a good practice to fully qualify your tables by including the
schema of the table.

For example, if you want to mirror an entire table:
```sql
SELECT *
FROM <schema>.<table>;
```
    
### Specify the Output

Hit Next and specify the job title. Edit the schema and table you want
to use at the output destination for the query results.

.. image:: assets/query__output.png
   :alt: Query Output Screen

You can enable **Upsert** by checking the Upsert box.

Learn more about upsert at the bottom of this document.

Hit Save.

Go run the job!

## Upsert

Any Zuar Runner Query job can be configured for [upsert](/jobs/io/upsert/). 

In addition to the standard additional steps for required for upsert,
the job config's **input** requires one additional parameter,
`last_modified_column`.

Here's an example of specifying the `last_modified_column`:

```json
    "input": {
            "dbo": "...",
            "query": [
                "..."
            ],
            "last_modified_column": "updated_at",
            "use": "query.io#QueryInput"
        },
```

The input's query has a column named `updated_at` that serves as the `last_modified_column`.

## Reflection

The default behavior in Zuar Runner is to learn the structure of data as it
is piped. This is useful for data coming from APIs and flat files
where there may not be a schema or data types.

This process is handled by the `ExtraColumnsTransform` and `ColumnsTransform``transforms` of the `#Input` **step**:

```json
    "steps": [
    	{
        	"transforms": [
            	{
            	    "use": "mitto.iov2.transform#ExtraColumnsTransform"
            	},
            	{
            	    "use": "mitto.iov2.transform#ColumnsTransform"
            	}
        	],
        	"use": "mitto.iov2.steps#Input"
    	},
        ...
    ]
```

However, when using databases as a source, as in Query jobs, Zuar Runner can
leverage the source database table's schema. This is referred to as
**reflection**. The result is that the output table can exactly mirror
the input table with column names and data types.

**NOTE: Reflection typically only makes sense for a SELECT * query.**

To use reflect, in the **steps** of the job config, update the `transforms` array of the `#Input` **step** to this:

```json
     "steps": [
         {
            "transforms": [
                {
                    "use": "mitto.iov2.transform#ReflectTransform",
                    "dbo": "{driver}://{username}:{password}@{server}:{port}/{database}",
                    "schema": "{schema}",
                    "tablename": "{tablename}"
                }
            ],
            "use": "mitto.iov2.steps#Input"
        },
        ...
    ]
```

- **dbo** - This is the [database
  url](/databases/database-urls/) of the table you are
  reflecting. This should match the exact `dbo` from the job's `Input`. 
- **schema** - This is the schema of the input table you are reflecting.
- **tablename** - This is the table name of the input table you are reflecting.