Google BigQuery¶
Google BigQuery is an enterprise data warehouse that allows to store and query massive datasets by enabling super-fast SQL queries using the processing power of Google's infrastructure.
Setup¶
Before using the power of Google BigQuery within ChartFactor, you need to set up a Google Cloud account and a project within it. You will also need to create a OAuth client ID for that project in order to authenticate in your cloud account using Oauth2.
To connecto to Google BigQuery from your browser, include in your HTML page the following:
- The google api client library
- The ChartFactor BigQuery provider library
Example below:
1 2 |
|
Then, declare your provider in your Javascript code as follows:
1 2 3 4 5 6 7 8 9 10 |
|
The description of the properties above is below:
name¶
This is the name you'd like to use to identify your provider. It is a standard property for all providers.
provider¶
This is the provider type, in this case google-bigquery
projectId¶
This is the project ID in your Google Cloud account
clientId¶
This is the OAuth 2.0 Client ID of your application when the BigQuery provider is running on the browser to enable OAuth authentication
scope¶
This property is optional. It is needed only if we have configured a special access to one of the Google services or APIs that is not included in the default scope provided by ChartFactor for the Google BigQuery provider. The scopes already included are:
1 2 3 |
|
file¶
This property is only applicable when deploying the BigQuery provider on the server-side in a Node.js environment. It is the location of the private key JSON file containing the security information necessary to establish a connection to BigQuery. An example of a private key JSON file is below:
1 2 3 4 5 6 7 8 9 10 11 12 |
|
otherProjects¶
This is a list of projects that you would also like to access, such as projects containing public datasets. This property is optional. By default the public project bigquery-public-data
will always be included.
Using your data provider¶
Now that you have your providers
array initialized (previous section) with your BigQuery provider, use the ChartFactor setProviders()
method to set your data provider definition:
1 |
|
Google BigQuery uses a hierarchical naming convention to identify its projects, datasets, and tables. You must use this same naming convention to specify table names in the aql:
1 2 3 4 |
|
project-id:dataset.table
Nested structures¶
Google BigQuery allows objects (nested and repeated data) to be stored by using fields of type RECORD. You can learn more about nested structures here.
ChartFactor supports the interaction with these structure by flattening them transparently so they can be used as any other field:
1 2 3 4 5 6 |
|
Response timeout¶
In some occasions, the Google BigQuery's engine may take time to process and execute queries, especially if the dataset is large and the queries have not been cached.
By default, when a request is sent by the BigQuery provider to the data engine, it will wait for a 60,000 milliseconds (a full minute) to receive the response. After that time it will throw a Timeout
error for the query. This value can be modified to a greater one in the custom metadata:
1 2 3 4 5 6 7 8 9 |
|
In the example above we are setting timeoutMs
with a waiting time of 2 mins for the response to come back. The value should always be set in milliseconds.
Supported Aggregations Out-Of-The-Box¶
SUM¶
1 |
|
AVG¶
1 |
|
MIN¶
1 |
|
MAX¶
1 |
|
COUNT DISTINCT¶
1 |
|
PERCENTILES¶
1 |
|
GEOHASH¶
Please refer to Geo Queries for more information.
Dependencies¶
- Google Api javascript client