Google BigQuery¶
Google BigQuery is an enterprise data warehouse that allows to store and query massive datasets by enabling super-fast SQL queries using the processing power of Google's infrastructure.
Setup¶
Before using the power of Google BigQuery within ChartFactor, you need to set up a Google Cloud account and a project within it. You will also need to create a OAuth client ID for that project in order to authenticate in your cloud account using Oauth2.
To connecto to Google BigQuery from your browser, include in your HTML page the following:
- The google api client library
- The ChartFactor BigQuery provider library
Example below:
1 2 |
|
Then, declare your provider in your Javascript code as follows:
1 2 3 4 5 6 7 8 9 10 |
|
The description of the properties above is below:
name¶
This is the name you'd like to use to identify your provider. It is a standard property for all providers.
provider¶
This is the provider type, in this case google-bigquery
projectId¶
This is the project ID in your Google Cloud account where your BigQuery data is stored. This is required to access your datasets and tables.
clientId¶
Defaults to ChartFactor’s Client ID if not provided. This is a unique application identifier assigned by Google Cloud. You should obtain your own Client ID when deploying your ChartFactor web apps on your own domain. To obtain your own Client ID, follow these steps:
- Go to the Google Cloud Console.
- Navigate to "APIs & Services" > "Credentials".
- Click on "Create Credentials" and select "OAuth client ID".
- Configure the consent screen if you haven't already (this is required for OAuth credentials).
- Choose the application type "Web Application" and provide the necessary details.
- Click "Create". The Client ID will be displayed on the screen and can also be found in the "Credentials" section under "OAuth 2.0 Client IDs".
The Client ID is applicable when running on the browser to enable the OAuth2 authentication. If you are deploying the BigQuery provider on the server side, you need to set a different property with your service account private key as specified below.
scope¶
This property is optional. It is needed only if your application requires special access to Google services or APIs that are not included in the default scope provided by ChartFactor for the Google BigQuery provider. The scopes already included are:
1 |
|
file¶
This property is only applicable when deploying the BigQuery provider on the server-side in a Node.js environment. It is the location of the private key JSON file containing the security information necessary to establish a connection to BigQuery. An example of a private key JSON file is below:
1 2 3 4 5 6 7 8 9 10 11 12 |
|
otherProjects¶
This is a list of projects that you would also like to access, such as projects containing public datasets. This property is optional. By default the public project bigquery-public-data
will always be included.
Using your data provider¶
Now that you have your providers
array initialized (previous section) with your BigQuery provider, use the ChartFactor setProviders()
method to set your data provider definition:
1 |
|
Google BigQuery uses a hierarchical naming convention to identify its projects, datasets, and tables. You must use this same naming convention to specify table names in the aql:
1 2 3 4 |
|
project-id:dataset.table
Nested structures¶
Google BigQuery allows objects (nested and repeated data) to be stored by using fields of type RECORD. You can learn more about nested structures here.
ChartFactor supports the interaction with these structure by flattening them transparently so they can be used as any other field:
1 2 3 4 5 6 |
|
Response timeout¶
In some occasions, the Google BigQuery's engine may take time to process and execute queries, especially if the dataset is large and the queries have not been cached.
By default, when a request is sent by the BigQuery provider to the data engine, it will wait for a 60,000 milliseconds (a full minute) to receive the response. After that time it will throw a Timeout
error for the query. This value can be modified to a greater one in the custom metadata:
1 2 3 4 5 6 7 8 9 |
|
In the example above we are setting timeoutMs
with a waiting time of 2 mins for the response to come back. The value should always be set in milliseconds.
Supported Aggregations Out-Of-The-Box¶
SUM¶
1 |
|
AVG¶
1 |
|
MIN¶
1 |
|
MAX¶
1 |
|
COUNT DISTINCT¶
1 |
|
PERCENTILES¶
1 |
|
GEOHASH¶
Please refer to Geo Queries for more information.
Dependencies¶
- Google Api javascript client