Skip to content

Google BigQuery

Google BigQuery is an enterprise data warehouse that allows to store and query massive datasets by enabling super-fast SQL queries using the processing power of Google's infrastructure.

Before using the power of Google BigQuery within ChartFactor, we need to set up a Google Cloud account and a project within it. We'll also need to create a client id for that project in order to authenticate in our cloud account using Oauth2.

To use the Google BigQuery provider we need to include the google api client and the provider:

1
2
<script src="https://apis.google.com/js/client.js"></script>
<script src="./CFT-google-bigquery-provider.min.js"></script>
1
2
3
4
5
6
7
8
9
// define providers
var providers = [{
    name:'Google BQ',
    provider:'google-bigquery',
    projectId: 'my-project-id',
    clientId: 'oauthClientId',
    otherProjects: [], //Optional
    scope: []  // Optional
}]

In the above configuration, we need to specify the project and client ids obtained from the Google Cloud account. The property otherProjects is a list of projects that we could also have access to like public datasets for example, and is optional. By default the public project bigquery-public-data will be always included.

The scope configuration is also optional. This is required only if we have configured a special access to one of the Google services or APIs that is not included in the default scope provided by ChartFactor for the Google BigQuery provider. The scopes already included are:

1
2
3
'https://www.googleapis.com/auth/bigquery',
'https://www.googleapis.com/auth/drive.readonly',
'https://www.googleapis.com/auth/cloud-platform.read-only'

Then, as in any other provider, use the setProviders() method of ChartFactor to set your data provider definitions:

1
cf.setProviders(providers);

Google BigQuery uses a hierarchical naming convention to identify its sources. We must use that same naming convention to specify source names in the aql:

1
2
3
4
cf.provider('Google BQ')
.source('bigquery-public-data:austin_crime.crime')
.groupby(...)
...
In general the format used is the following: project-id:dataset:table

Supported Aggregations Out-Of-The-Box

SUM

1
    var metric = cf.Metric("amount","sum");

AVG

1
    var metric = cf.Metric("amount","avg");

MIN

1
    var metric = cf.Metric("amount","min");

MAX

1
    var metric = cf.Metric("amount","max");

COUNT DISTINCT

1
    var metric = cf.Metric("my_attribute","unique");

Dependencies

  • Google Api javascript client