# Google Analytics

## Overview

The Google Analytics connector enables Zuar Runner to pipe data from Google Analytics and store the data in a database.

.. image:: assets/google-analytics-1.png
   :alt: Zuar Runner Google Analytics data flow

In order to create a Google Analytics job in Zuar Runner, you'll need two pieces of information from your organization's Google Analytics account:

1. A `Service Account JSON credentials file` - see the "Initial Google Analytics Setup for Zuar Runner" section below for instructions to enable the API, create a Service Account, and download this file.
2. The `View ID` for the specific View which will provide data to Zuar Runner.

## Initial Google Analytics Setup for Zuar Runner

You'll only need to complete these steps once, unless you set up multiple Service Accounts.

.. NOTE::
   These steps require Google Analytics administrator access. Log in with the correct administrator account before performing any of the following steps.

### Enable the Google Analytics API

1. While logged in as an administrator of your organization's Google Analytics account, browse to Google Cloud's [APIs & Services page](https://console.developers.google.com/apis/library/analyticsreporting.googleapis.com).

    .. image:: assets/google-analytics_enable_api.png
       :alt: Enabling GA Reporting API

2. Make sure you have the correct Project selected in the header drop-down.
3. Click **Enable**. The page will be updated to display Google Cloud's **APIs & Services** page, indicating that API access has been enabled for your Google Analytics account.

### Create a Service Account

.. NOTE::
   A service account's credentials include a unique, generated email address and at least one public/private key pair. If *domain-wide delegation* is enabled, a client ID is also part of the service account's credentials.

.. |sa-accts| image:: assets/google-analytics_iam_sa_accts.png
   :alt: Google IAM & Admin's Service Accounts page

..
|sa-name| image:: assets/google-analytics-4.png
   :alt: Service Account details screen

1. Go to Google IAM & Admin's [Service Accounts page](https://console.developers.google.com/iam-admin/serviceaccounts).

    |sa-accts|

2. Select the appropriate project from the header drop-down.
3. Click **+ CREATE SERVICE ACCOUNT**.
4. Enter a name and description for the Service Account, for example:

    |sa-name|

5. Select the `Viewer` role under **Quick Access** > **Basic**. Select **Continue**.

    .. image:: assets/google-analytics_select_role.png
       :alt: Select Service Account role

6. Click **Done** to create the Service Account.
7. Click on the newly created Service Account's email address to edit its details.
8. Click the **KEYS** tab. Select **Create new key** from the **ADD KEY** drop-down.

    .. image:: assets/google-analytics_sa_keys.png
       :alt: Service Account key creation

9. Ensure **JSON** is selected and click **CREATE**. The JSON file will be downloaded. Store this file securely as you'll need it when creating Google Analytics jobs in Zuar Runner. **This is the only time you'll be able to access this file!**
10. Copy the `client_email` email address from the JSON file. This is needed later to add this service account to Google Analytics.

Reference: [Google Cloud Service Account Documentation](https://developers.google.com/identity/protocols/OAuth2ServiceAccount#creatinganaccount)

### Add the Service Account User to the Google Analytics Account

You can add users at the account, property, or view level. The level at which you add a user determines that user's initial access. For example, if you add a user at the account level, then that user also has access to all the properties and views in the account, with the same set of permissions. If you add a user at the view level, then the user has access to only that view with the permissions you provide. You can change the level of access and permissions for a user at any time.

Source: [Google Analytics User Management Documentation](https://support.google.com/analytics/answer/1009702)

Unless your organization's security needs prevent it, we recommend attaching the Service Account at the **account** level. Your Service Account will then have access to all properties and views on the account. Otherwise, you'll need to repeat the steps below for each property or view you wish to access via Zuar Runner's Service Account.

1. Log in to [Google Analytics](https://analytics.google.com/analytics).
2. Choose **Admin** in the left sidebar. If the sidebar is collapsed, you'll only see a gear icon.
3. Select the correct **Account** from the drop-down.
4. Click **Account Access Management** in the **Account** column. (If you are attaching the Service Account at a lower level, select the property and/or view and then the **Access Management** link at the desired level.)
5. Click the blue **+** icon in the upper-right and choose **Add new user**.
6. Enter the email address created for the Service Account (`client_email` from the JSON file).
7. Choose the **Analyst** Role.
8. Click **Add** in the upper-right.

## Create a Google Analytics Job in Zuar Runner

Once you've set up your Google Analytics account and obtained the JSON credentials file (see the section above, "Initial Google Analytics Setup for Zuar Runner"), you're ready to add your first Zuar Runner job to access data.

1. Click **+Add Job** in Zuar Runner's left sidebar.
2. Choose **Google Analytics**.

    .. image:: assets/google-analytics-7.png
       :alt: Zuar Runner's Google Analytics wizard

3. The Google Analytics connector uses a JSON credentials file from a Service Account for authentication. See the "Create a Service Account" section above for more information. Click the box labeled **Choose the Json file or drag it here** to select your JSON file from the file browser, or drag and drop the JSON file into the box on the Zuar Runner webpage.

..
image:: assets/google-analytics_acct_creds.png
   :alt: Add GA Service Account credentials JSON file

4. Click **Next**.
5. If the JSON file is valid, you'll see a confirmation message. Otherwise, click the link to try again.

    .. image:: assets/google-analytics-9.png
       :alt: GA Service Account JSON file confirmation

6. Click **Next**.
7. Specify the View ID and click **Next**. See the **Find a View ID** section below for more information about the View ID.

    .. image:: assets/google-analytics_view_id.png
       :alt: Specify View ID

8. Click **Next**.

    The Zuar Runner Google Analytics connector lets you pick a combination of metrics and dimensions that will create a report from the Google Analytics API. Here's a third-party tool that can help with this process: [Google Analytics Dimensions and Metrics Explorer](https://ga-dev-tools.web.app/dimensions-metrics-explorer/)

9. Select up to 10 desired **metrics** and click **Next**.

    .. image:: assets/google-analytics_select_metrics.png
       :alt: Select metrics

10. Select up to 8 **dimensions** and click **Next**.

    .. image:: assets/google-analytics_select_dims.png
       :alt: Select dimensions

11. Specify where Zuar Runner should store the data.

    .. image:: assets/google-analytics_output.png
       :alt: Specify the output details

    - **Title**
      - The title of the Zuar Runner job created by the wizard
      - Pre-populated based on metrics and dimensions selected but can be modified.
    - **Type**
      - **Local Database** (recommended): Zuar Runner's internal PostgreSQL database
      - **Redshift**: AWS-hosted database
      - **Custom**: Specify the connection string of the destination database.
    - **Schema**
      - The schema is a database organization mechanism which groups database objects (tables, views, etc.) and can be used for security purposes.
      - Default schema: `ga`
      - See the **Best Practices** section below for tips on naming schemas.
    - **Table**
      - The name of the database table that Zuar Runner will use to store the Google Analytics data retrieved by this job.
      - Pre-populated based on metrics and dimensions selected but can be modified.

12. Click **Save**.

## Find a View ID

A *view* is your access point for reports; a defined view of data from a property. You give users access to a view so they can see the reports based on that view's data. A property can contain one or more views.

Source: [Google Analytics Hierarchy Documentation](https://support.google.com/analytics/answer/1009618)

.. NOTE::
   In 2020, Google Analytics introduced a new type of property called **GA4** which is a replacement for the older **Universal Analytics (UA)** properties. Currently, Zuar Runner jobs require Universal Analytics (UA) properties.

### Google Analytics Account with Universal Analytics (UA) Properties

If your account is using the new GA4 properties, see the GA4 section below.

1. Log in to [Google Analytics](https://analytics.google.com/analytics).
2. Choose **Admin** in the left sidebar. If the sidebar is collapsed, you'll only see a gear icon.
3. Select the correct **Account**, **Property**, and **View** combination from the drop-downs.
4. Click **View Settings** in the **View** column. **View ID** will be listed under **Basic Settings**. Save this value for use when creating a Zuar Runner job.

### Google Analytics Account using GA4 Properties

If your account is using the new GA4 properties, check your account to see if the older UA properties still exist. (UA properties are prefixed with `UA-`.) If so, use the UA property and proceed with step 3 above.

If you don't have a UA property, you'll need to create one for use with Zuar Runner, following these steps:

1. In [Google Analytics](https://analytics.google.com/analytics), choose **Admin** in the left sidebar. If the sidebar is collapsed, you'll only see a gear icon.
2. Select the correct **Account** from the drop-down.
3. Click **+ Create Property**.
4. Click the **Show Advanced Properties** button.
5. Enable the switch control next to **Create a Universal Analytics property**.
6.
   Select the radio button for **Create a Universal Analytics property only**.
7. To connect the Service Account, proceed with **step 3** in the UA section above.

## Google Analytics and Zuar Runner Best Practices

### Test Metrics and Dimensions Combinations Before Creating Your Zuar Runner Jobs

Use a third-party tool ([Google Analytics Dimensions and Metrics Explorer](https://ga-dev-tools.web.app/dimensions-metrics-explorer/)) to determine the combinations of dimensions and metrics that work for your use case before you go through the Zuar Runner plugin wizard process.

### Naming Convention for Zuar Runner Jobs and Database Schemas

Users typically pull data from more than one View ID in Google Analytics, so a consistent naming convention for jobs and output schemas/tables is important. The default database schema is `ga`, but we recommend using the following schema naming format if you are pulling from multiple Google Analytics views:

- `ga_name_of_view_id_1`
- `ga_name_of_view_id_2`

If you separate Google Analytics views into different schemas, but keep the table names the same (for the same combination of dimensions and metrics), you can easily start to analyze data across multiple view IDs or Google Analytics accounts.

### Data Normalization

Google APIs format time-based metrics differently than what you may be familiar with in the Google Analytics web application. Additional transformations may be required based on the desired time format.

Google returns the following time-based metrics in milliseconds or seconds by default:

.. image:: assets/google-analytics_time_based_metrics.png
   :alt: Time-based metrics returned in milliseconds or seconds

## Anatomy of a Zuar Runner Google Analytics Job

When editing a Zuar Runner Google Analytics job, the following attributes are available. In this section we explain attributes you might want to update or add to your job configuration. Some of these are not accessible through Zuar Runner's Google Analytics wizard.
The JSON for these job types will look similar to this (only the `input` section is shown here):

```json
input: {
    use: ga.io#GaInput
    credentials: {
        type: service_account
        project_id:
        private_key_id:
        private_key:
        client_email:
        client_id:
        auth_uri: https://accounts.google.com/o/oauth2/auth
        token_uri: https://oauth2.googleapis.com/token
        auth_provider_x509_cert_url: https://www.googleapis.com/oauth2/v1/certs
        client_x509_cert_url:
    }
    view_id:
    start_date: 2021-01-01
    end_date:
    chunk_size_in_days: 90
    metrics: [
        ga:users
        ga:newUsers
        ga:percentNewSessions
    ]
    dimensions: [
        ga:date
        ga:continent
        ga:country
        ga:region
        ga:city
        ga:latitude
        ga:longitude
    ]
}
```

### view_id

The `view_id` attribute determines which Google Analytics view the Zuar Runner job will access data from. If the ID was entered incorrectly into Zuar Runner's Google Analytics wizard, it can be updated here.

- **Required**: Yes
- **Format**: Match the format given in Google Analytics, e.g. `123987`
- See the "Find a View ID" section above for information on retrieving the View ID from Google Analytics.

### start_date

The `start_date` attribute determines the first date for which the Zuar Runner job will retrieve data.

- **Required**: No
- **Format**: `YYYY-MM-DD`
- **Default**: 2005-01-01 (January 1, 2005)
- **Recommended setting**: set appropriately for analytical use case; most scenarios do not require data going back to 2005

### end_date

The `end_date` attribute determines the last date for which the Zuar Runner job will retrieve data.

- **Required**: No
- **Format**: `YYYY-MM-DD`
- **Default**: None (get latest available data)
- **Recommended setting**: set appropriately for analytical use case; no need to set if Zuar Runner should get the latest data each time the job executes

### chunk_size_in_days

The `chunk_size_in_days` attribute determines the number of days' worth of data to be included in each "chunk" returned by Google Analytics.
While this is not a required attribute, we recommend setting it to `90` to limit the number of API calls Zuar Runner makes and avoid hitting the Google Analytics API rate limit. Additionally, when a large dataset is requested, this setting prevents the automatic sampling Google Analytics would otherwise apply to the results, ensuring data accuracy.

- **Required**: No
- **Format**: number of days as an integer, e.g. `90`
- **Default**: None
- **Recommended setting**: `90`

### metrics

The `metrics` attribute is populated by the Google Analytics metrics you chose during the initial setup via Zuar Runner's wizard. If you need to modify them after initial job creation, you can do so via this attribute. See the section above on metrics and dimensions for more information on finding and testing metrics and dimensions.

- **Required**: Yes
- **Format**: List each metric on a new line between brackets, for example:

  ```json
  metrics: [
      ga:users
      ga:newUsers
      ga:percentNewSessions
  ]
  ```

- **Default**: None
- **Recommended setting**: set appropriately for analytical use case

### dimensions

The `dimensions` attribute is populated by the Google Analytics dimensions you chose during the initial setup via Zuar Runner's wizard. If you need to modify them after initial job creation, you can do so via this attribute. See the section above on metrics and dimensions for more information on finding and testing metrics and dimensions.

- **Required**: No
- **Format**: List each dimension on a new line between brackets, for example:

  ```json
  dimensions: [
      ga:date
      ga:continent
      ga:country
      ga:region
      ga:city
      ga:latitude
      ga:longitude
  ]
  ```

- **Default**: None
- **Recommended setting**: set appropriately for analytical use case
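
To make the `chunk_size_in_days` attribute described earlier more concrete, the Python sketch below splits a `start_date`/`end_date` range into consecutive windows of at most that many days. It is only an illustration: `date_chunks` is a hypothetical helper, not part of Zuar Runner, and it assumes chunks are contiguous, non-overlapping, inclusive date ranges. Zuar Runner's actual chunking logic may differ.

```python
from datetime import date, timedelta


def date_chunks(start, end, chunk_size_in_days):
    """Yield (chunk_start, chunk_end) pairs covering start..end inclusive.

    Hypothetical helper for illustration only; not a Zuar Runner API.
    """
    if chunk_size_in_days < 1:
        raise ValueError("chunk_size_in_days must be a positive integer")
    chunk_start = start
    while chunk_start <= end:
        # Each window spans at most chunk_size_in_days days, clipped to `end`.
        chunk_end = min(chunk_start + timedelta(days=chunk_size_in_days - 1), end)
        yield chunk_start, chunk_end
        # The next window begins the day after this one ends.
        chunk_start = chunk_end + timedelta(days=1)


if __name__ == "__main__":
    # One request per window instead of one request per day.
    for chunk_start, chunk_end in date_chunks(date(2021, 1, 1), date(2021, 12, 31), 90):
        print(chunk_start.isoformat(), chunk_end.isoformat())
```

Under these assumptions, a job covering all of 2021 with `chunk_size_in_days: 90` would be split into five windows, the first running from 2021-01-01 to 2021-03-31 and the last from 2021-12-27 to 2021-12-31.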