How to calculate the difference between metrics in different aggregations in Elasticsearch

I want to calculate the difference between nested aggregation values for two dates.

More concretely: given the request and response below, is it possible to calculate date_1.buckets.field_1.buckets.field_2.buckets.field_3.value - date_2.buckets.field_1.buckets.field_2.buckets.field_3.value? Can this be done with Elasticsearch v1.0.1?

The aggregation query request looks like this:

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "date": [
                  "2014-08-18 00:00:00.0",
                  "2014-08-15 00:00:00.0"
                ]
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "date_1": {
      "filter": {
        "terms": {
          "date": [
            "2014-08-18 00:00:00.0"
          ]
        }
      },
      "aggs": {
        "my_agg_1": {
          "terms": {
            "field": "field_1",
            "size": 2147483647,
            "order": {
              "_term": "desc"
            }
          },
          "aggs": {
            "my_agg_2": {
              "terms": {
                "field": "field_2",
                "size": 2147483647,
                "order": {
                  "_term": "desc"
                }
              },
              "aggs": {
                "my_agg_3": {
                  "sum": {
                    "field": "field_3"
                  }
                }
              }
            }
          }
        }
      }
    },
    "date_2": {
      "filter": {
        "terms": {
          "date": [
            "2014-08-15 00:00:00.0"
          ]
        }
      },
      "aggs": {
        "my_agg_1": {
          "terms": {
            "field": "field_1",
            "size": 2147483647,
            "order": {
              "_term": "desc"
            }
          },
          "aggs": {
            "my_agg_2": {
              "terms": {
                "field": "field_2",
                "size": 2147483647,
                "order": {
                  "_term": "desc"
                }
              },
              "aggs": {
                "my_agg_3": {
                  "sum": {
                    "field": "field_3"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

And the response looks like this:

{
  "took": 236,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1646,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "date_1": {
      "doc_count": 823,
      "field_1": {
        "buckets": [
          {
            "key": "field_1_key_1",
            "doc_count": 719,
            "field_2": {
              "buckets": [
                {
                  "key": "key_1",
                  "doc_count": 275,
                  "field_3": {
                    "value": 100
                  }
                }
              ]
            }
          }
        ]
      }
    },
    "date_2": {
      "doc_count": 823,
      "field_1": {
        "buckets": [
          {
            "key": "field_1_key_1",
            "doc_count": 719,
            "field_2": {
              "buckets": [
                {
                  "key": "key_1",
                  "doc_count": 275,
                  "field_3": {
                    "value": 80
                  }
                }
              ]
            }
          }
        ]
      }
    }
  }
}

Thank you.

The Elasticsearch DSL does not allow arithmetic operations between the results of two aggregations, not even with scripts (at least up to version 1.1.1, as far as I know).

Such operations need to be handled on the client side, after processing the aggregation results.
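As a rough illustration of the client-side approach, the sketch below walks the "aggregations" part of a response shaped like the one in the question and subtracts the date_2 sums from the date_1 sums per (field_1 key, field_2 key) pair. The function names are hypothetical, and treating a key that is missing on one side as 0 is an assumption, not something the question specifies.

```python
def sums_by_key(date_agg):
    """Flatten one date filter's nested buckets into {(field_1 key, field_2 key): sum}."""
    result = {}
    for b1 in date_agg["field_1"]["buckets"]:
        for b2 in b1["field_2"]["buckets"]:
            result[(b1["key"], b2["key"])] = b2["field_3"]["value"]
    return result


def diff_between_dates(aggregations, first="date_1", second="date_2"):
    """Return first - second for every key pair seen in either date."""
    a = sums_by_key(aggregations[first])
    b = sums_by_key(aggregations[second])
    # Assumption: a key pair missing for one date counts as 0.
    return {key: a.get(key, 0) - b.get(key, 0) for key in a.keys() | b.keys()}


# The "aggregations" object from the response in the question.
aggregations = {
    "date_1": {"doc_count": 823, "field_1": {"buckets": [
        {"key": "field_1_key_1", "doc_count": 719, "field_2": {"buckets": [
            {"key": "key_1", "doc_count": 275, "field_3": {"value": 100}}]}}]}},
    "date_2": {"doc_count": 823, "field_1": {"buckets": [
        {"key": "field_1_key_1", "doc_count": 719, "field_2": {"buckets": [
            {"key": "key_1", "doc_count": 275, "field_3": {"value": 80}}]}}]}},
}

differences = diff_between_dates(aggregations)
print(differences)  # {('field_1_key_1', 'key_1'): 20}
```

Keying the result by the bucket keys rather than by list position keeps the subtraction correct even if the two dates return their buckets in a different order or with different lengths.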

Reference

elasticsearch aggregation to sort by ratio of aggregations

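With a newer Elasticsearch version (e.g. 5.6.9) this is possible with a script. The example below computes the difference between two date fields for each document and groups the results into ranges: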

{
  "size": 0,
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "filter": [
            {
              "range": {
                "date_created": {
                  "gte": "2018-06-16T00:00:00+02:00",
                  "lte": "2018-06-16T23:59:59+02:00"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "by_millisec": {
      "range" : {
        "script" : {
          "lang": "painless",
          "source": "doc['date_delivered'][0] - doc['date_created'][0]"
        },
        "ranges" : [
          { "key": "<1sec", "to": 1000.0 },
          { "key": "1-5sec", "from": 1000.0, "to": 5000.0 },
          { "key": "5-30sec", "from": 5000.0, "to": 30000.0 },
          { "key": "30-60sec", "from": 30000.0, "to": 60000.0 },
          { "key": "1-2min", "from": 60000.0, "to": 120000.0 },
          { "key": "2-5min", "from": 120000.0, "to": 300000.0 },
          { "key": "5-10min", "from": 300000.0, "to": 600000.0 },
          { "key": ">10min", "from": 600000.0 }
        ]
      }
    }
  }
}
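To make the bucketing in this query easier to follow, here is a small sketch of the same logic in plain Python: compute the difference in milliseconds between two timestamps and find the matching range. It assumes the usual range aggregation semantics ("from" inclusive, "to" exclusive); the function names and the sample timestamps are made up for illustration.

```python
from datetime import datetime

# Same ranges as in the query: (key, from, to); None means unbounded.
RANGES = [
    ("<1sec", None, 1000.0),
    ("1-5sec", 1000.0, 5000.0),
    ("5-30sec", 5000.0, 30000.0),
    ("30-60sec", 30000.0, 60000.0),
    ("1-2min", 60000.0, 120000.0),
    ("2-5min", 120000.0, 300000.0),
    ("5-10min", 300000.0, 600000.0),
    (">10min", 600000.0, None),
]


def bucket_for(delta_ms):
    """Return the key of the range containing delta_ms ("from" inclusive, "to" exclusive)."""
    for key, lower, upper in RANGES:
        if (lower is None or delta_ms >= lower) and (upper is None or delta_ms < upper):
            return key
    return None


def delivery_bucket(created_iso, delivered_iso):
    """Mirror the script: date_delivered - date_created, in milliseconds."""
    created = datetime.fromisoformat(created_iso)
    delivered = datetime.fromisoformat(delivered_iso)
    delta_ms = (delivered - created).total_seconds() * 1000.0
    return bucket_for(delta_ms)


print(delivery_bucket("2018-06-16T00:00:00+02:00", "2018-06-16T00:00:03+02:00"))  # 1-5sec
```

Note that this is a per-document difference between two fields, which is a different problem from subtracting the results of two separate aggregations.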

In 1.0.1 I couldn't find anything, but in 1.4.2 you could try the scripted_metric aggregation (still experimental).

Here is the scripted_metric documentation page.

I am not good with the Elasticsearch syntax, but I think your metric inputs would be:

init_script: just initialize an accumulator for each date:

"init_script": "_agg.d1Val = 0; _agg.d2Val = 0;"

map_script: test the date of the document and add to the right accumulator:

"map_script": "if (doc['date'].value == firstDate) { _agg.d1Val += doc['field_3'].value; } else { _agg.d2Val += doc['field_3'].value; }",

reduce_script: accumulate the intermediate data from the various shards and return the final result:

"reduce_script": "totalD1 = 0; totalD2 = 0; for (agg in _aggs) {  totalD1 += agg.d1Val ; totalD2 += agg.d2Val ;}; return totalD1 - totalD2"

I don't think that in this case you need a combine_script.
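The flow of the three scripts can be mimicked in plain Python to see how the per-shard accumulators end up as one number. This is only a simulation of the idea, not Elasticsearch code: the shard layout, the sample documents and the function names are invented, and both branches accumulate with +=, assuming the intent is a sum per date.

```python
def init():
    """init_script: one accumulator per date, created once per shard."""
    return {"d1Val": 0, "d2Val": 0}


def map_doc(agg, doc, first_date):
    """map_script: add the document's field_3 to the accumulator for its date."""
    if doc["date"] == first_date:
        agg["d1Val"] += doc["field_3"]
    else:
        agg["d2Val"] += doc["field_3"]


def reduce_shards(shard_aggs):
    """reduce_script: sum the per-shard accumulators and return the difference."""
    total_d1 = sum(agg["d1Val"] for agg in shard_aggs)
    total_d2 = sum(agg["d2Val"] for agg in shard_aggs)
    return total_d1 - total_d2


def run(shards, first_date):
    shard_aggs = []
    for docs in shards:
        agg = init()
        for doc in docs:
            map_doc(agg, doc, first_date)
        shard_aggs.append(agg)
    return reduce_shards(shard_aggs)


# Two hypothetical shards, each holding documents for both dates.
shards = [
    [{"date": "2014-08-18", "field_3": 60}, {"date": "2014-08-15", "field_3": 50}],
    [{"date": "2014-08-18", "field_3": 40}, {"date": "2014-08-15", "field_3": 30}],
]

result = run(shards, "2014-08-18")
print(result)  # (60 + 40) - (50 + 30) = 20
```

This yields a single difference over all matching documents. To get one difference per field_1/field_2 bucket, as in the question, the metric would presumably have to sit inside the nested terms aggregations.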

Of course, if you can't use 1.4.2 then this is no help :-)

Comments
  • I am not sure whether this is possible in a newer version, but in ES 1.1.0 I had to handle it manually after receiving the response. See: elasticsearch aggregation to sort by ratio of aggregations
  • @PrayagUpd I might have to handle it on the client side. I am wondering whether it is possible to do this in Elasticsearch with the version I am running. It might be possible in the future with the scripted metric aggregation (ES 1.4.0).
  • That might solve it, but the feature itself is experimental in 1.4.0. I had to handle this manually over thousands of documents for 3 or 4 features in my analytics app. Hopefully it will be built in in coming versions.
  • Thanks for the answer. I had found scripted_metric as well, but I cannot use that version, and it is still experimental.