Elasticsearch

The Elasticsearch data provider allows ChartFactor to interact with Elasticsearch through its HTTP API. Internally, it takes advantage of the Elasticsearch JavaScript API, elasticsearch.js, to perform all aggregation requests.

The Elasticsearch data provider supports Elasticsearch versions 6.x and 7.x.

First, include the library in the main HTML file of your app:

 <script src="./CFT-elasticsearch-provider.min.js"></script>

The provider JSON object requires the url parameter in addition to the name and provider parameters. Example:

var providers = [{
    name:'ElasticSearch',
    provider:'elasticsearch',
    url:'https://chartfactor.com:9200'
}]

The Elasticsearch configuration also lets you specify headers when they are required, for example for authentication purposes:

var providers = [{
    ...
    headers: { 'Authorization': authToken }
}]
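As an illustration of where an authToken value might come from, the sketch below builds an HTTP Basic authentication header. The credentials are hypothetical placeholders, and whether your cluster accepts Basic authentication (rather than, say, an API key or bearer token) depends on its security configuration.

```javascript
// Hypothetical credentials; load real ones from your own secure configuration.
const username = 'elastic_user';
const password = 'example_password';

// HTTP Basic authentication: "Basic " followed by base64("username:password").
const authToken = 'Basic ' + btoa(username + ':' + password);

// Provider definition carrying the header on every request.
const providers = [{
    name: 'ElasticSearch',
    provider: 'elasticsearch',
    url: 'https://chartfactor.com:9200',
    headers: { 'Authorization': authToken }
}];
```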

Then, use the setProviders() method of ChartFactor to set your data provider definitions. Example:

cf.setProviders(providers);

Top-n and Bottom-n limitations

The Elasticsearch provider supports Top-n and Bottom-n queries with the following limitations:

Up to 2 group-by attributes. Example:

    // define metrics
    var metric = cf.Metric('commission', 'sum');
    var metrics = [metric];

    // define attributes to group by
    var group = cf.Attribute('venuestate')
                        .limit(10)
                        .sort('desc', metric);
    var group2 = cf.Attribute('venuecity.keyword')
                        .limit(10)
                        .sort('desc', metric);

    // set up the chart
    // the provider name must match the name given in the provider definition
    var myChart = cf.provider('ElasticSearch')
                    .source('ticket_sales')
                    .groupby(group, group2)
                    .metrics(...metrics)
                    .element('query_1')
                    .set('legend','right');

    // execute it
    myChart.execute();

Up to 1 group-by attribute and 1 column group attribute for intersection queries. Example:

// Define metrics
let metric0 = cf.Metric("fare", "sum");
// Define attributes to group by
let group1 = cf.Attribute("company")
    .limit(10)
    .sort("desc", cf.Metric("fare", "sum"));
let colGroup1 = cf.Attribute("trip_start_timestamp")
    .func("MONTH")
    .limit(10)
    .sort("asc", "trip_start_timestamp");
// set up provider, source, metrics, group, and column group
let myChart = cf.provider("ElasticSearch")
    .source("chicago_taxi_trips")
    .groupby(group1).colgroupby(colGroup1)
    .metrics(metric0)
    .element('query_1')
    .execute();

Pagination is not currently supported for Top-n and Bottom-n queries.

Raw Data limitations

By default, Elasticsearch limits the maximum value of from + size for searches, which affects raw-level queries (e.g., the Raw Data Table) when users scroll down through the results. This limit is the dynamic per-index setting index.max_result_window, and its default value is 10000. Please refer to the Elasticsearch documentation to update this setting.

Please note that users normally narrow down the data before scanning it at the raw level, so modifying this Elasticsearch setting should rarely be needed.
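If the limit does need to be raised, one way is the Elasticsearch update index settings API (PUT /<index>/_settings) with the index.max_result_window setting. The sketch below only builds the request; the helper name, the index name, and the value 50000 are illustrative assumptions, and sending the request requires network access and the appropriate cluster permissions.

```javascript
// Build the request for the Elasticsearch update index settings API.
// baseUrl and indexName are placeholders for your own cluster and index.
function buildMaxResultWindowRequest(baseUrl, indexName, maxResultWindow) {
    return {
        url: baseUrl + '/' + encodeURIComponent(indexName) + '/_settings',
        options: {
            method: 'PUT',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ index: { max_result_window: maxResultWindow } })
        }
    };
}

const settingsRequest = buildMaxResultWindowRequest(
    'https://chartfactor.com:9200', 'ticket_sales', 50000);

// To apply the change, send the request, for example with fetch:
// fetch(settingsRequest.url, settingsRequest.options)
//     .then((response) => response.json())
//     .then(console.log);
```

Larger windows increase memory use on the Elasticsearch side, so it is usually better to raise the value only as far as the use case requires.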

Wildcard Source Name Support

The Elasticsearch data provider supports wildcard source names. For example, in a logging use case, a typical index name consists of a string prefix followed by the date in YYYY.MM.DD format. In this situation, you can create visualizations on a source that includes the indexes for every day in May 2015 by using a pattern like logstash-2015.05*. Example:

// define providers
var providers = [{
    name:'ElasticSearch',
    provider:'elasticsearch',
    url:'https://chartfactor.com:9200'
}]

cf.setProviders(providers);

// obtain the data source
var ticketSales = cf.provider('ElasticSearch').source('logstash-2015.05*');
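When the month changes over time, the pattern can be generated rather than hard-coded. The helper below is a hypothetical convenience function, not part of ChartFactor; it assumes index names of the form prefix-YYYY.MM.DD as described above.

```javascript
// Hypothetical helper: build a wildcard pattern matching all daily indexes of a month.
function monthlyIndexPattern(prefix, year, month) {
    // Zero-pad the month so that 5 becomes "05".
    const mm = String(month).padStart(2, '0');
    return prefix + '-' + year + '.' + mm + '*';
}

const mayLogsPattern = monthlyIndexPattern('logstash', 2015, 5); // 'logstash-2015.05*'

// The pattern can then be used as the source name:
// var mayLogs = cf.provider('ElasticSearch').source(mayLogsPattern);
```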

Custom routing

The Elasticsearch data engine uses a default routing mechanism to determine the shard on which a specific document is stored and searched.

Elasticsearch also supports Custom Routing. If you use custom routing to index your documents, you should also configure your ChartFactor Elasticsearch data provider to make optimal use of the routing information when performing aggregations and retrieving raw data. This is done through Custom Metadata, by adding the routing property to the specific source (e.g. index). An example is presented below.

let customMetadata = {
    [source]: {
        routing: routingObj,
        fields: {
            'created_at': {
                label: 'Created at'
            },
            "airline": {
                label: "Airline"
            }
        }
    }
};

The important part is the routing property. The routing property must receive an object containing the path and function properties. An example of a routing object is below.

const routingObj = {
    path: 'created_at',
    function: (obj) => {
        let fromDate = new Date(obj.created_at[0]);
        let toDate = new Date(obj.created_at[1]);

        let months = (fromDate.getMonth() + 1) + ',' + (toDate.getMonth() + 1);
        // Remove duplicates
        let array = [...new Set(months.split(','))];

        return array.join(',');
    }
};

path must be set to a single field name or an array of field names. During query execution, the values of the filters that match these fields are passed to the function defined in the function property. Examples of valid path entries are path: 'created_at' and path: ['created_at', 'airline'].
function must be set to a function that takes a single object parameter. During execution, this function receives an object containing the filter values of the fields specified in the path property. Here you provide the logic that returns the routing string, most likely derived from the filter values.

The example above assumes that documents were previously indexed using the created_at month as the routing value. Therefore, the function in the example returns the month numbers found in the created_at property, which contains two dates, from and to. If both dates fall in the same month, the logic removes the duplicate so that it returns the string "5" rather than "5,5".

Note

The function is invoked only when the query or visualization includes at least one filter for the fields specified in the path property. Otherwise, it is not invoked since the routing function depends on these filter values to obtain the routing result.
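To see what such a routing function returns, it can be called directly with sample filter values. The sketch below uses a slightly condensed variant of the routing object shown earlier. The shape of the argument (an object keyed by field name, holding the from and to dates as an array) is inferred from that example, and the dates are made up for illustration.

```javascript
// Condensed variant of the routing object from the example above.
const routingObj = {
    path: 'created_at',
    function: (obj) => {
        const fromDate = new Date(obj.created_at[0]);
        const toDate = new Date(obj.created_at[1]);
        // Month numbers (1-12) of the range boundaries, with duplicates removed.
        const months = [fromDate.getMonth() + 1, toDate.getMonth() + 1];
        return [...new Set(months)].join(',');
    }
};

// Date-time strings without a time zone offset are parsed as local time,
// which matches getMonth() returning the local month.
const sameMonth = routingObj.function({
    created_at: ['2019-05-03T12:00:00', '2019-05-28T12:00:00']
}); // "5"

const twoMonths = routingObj.function({
    created_at: ['2019-05-03T12:00:00', '2019-06-10T12:00:00']
}); // "5,6"
```

Note that this logic only considers the first and last month of the range. A filter spanning three or more months would need the function to enumerate the months in between as well.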

Case-insensitive searches

Case-insensitive searches are supported for fields mapped as both text and keyword, which is also the default string mapping in Elasticsearch. The following example mapping for the venuename field supports case-insensitive searches:

"venuename": {
    "type": "text",
    "fields": {
        "keyword": {
            "type": "keyword",
            "null_value": "null",
            "ignore_above": 256
        }
    }
}

Supported Aggregations Out-Of-The-Box

SUM

    var metric = cf.Metric("amount","sum");

AVG

    var metric = cf.Metric("amount","avg");

MIN

    var metric = cf.Metric("amount","min");

MAX

    var metric = cf.Metric("amount","max");

COUNT DISTINCT

    var metric = cf.Metric("my_attribute","unique");

PERCENTILES

    var metric = cf.Metric('commission', 'percentiles');

GEOHASH

Please refer to Geo Queries for more information.

Dependencies

elasticsearch.js 14.0.0