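Derivative aggregation

A parent pipeline aggregation which calculates the derivative of a specified metric in a parent histogram (or date_histogram) aggregation. The specified metric must be numeric, and the enclosing histogram must have min_doc_count set to 0 (the default for histogram aggregations).

Syntax

A derivative aggregation looks like this in isolation: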

"derivative": {
  "buckets_path": "the_sum"
}

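Table 59. derivative parameters

| Parameter name | Description | Required | Default value |
| buckets_path | The path to the buckets we wish to find the derivative for (see buckets_path syntax for more details) | Required | |
| gap_policy | The policy to apply when gaps are found in the data (see Dealing with gaps in the data for more details) | Optional | skip |
| format | DecimalFormat pattern for the output value. If specified, the formatted value is returned in the aggregation's value_as_string property | Optional | null |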

First order derivative

The following snippet calculates the derivative of the total monthly sales:

response = client.search(
  index: 'sales',
  body: {
    size: 0,
    aggregations: {
      sales_per_month: {
        date_histogram: {
          field: 'date',
          calendar_interval: 'month'
        },
        aggregations: {
          sales: {
            sum: {
              field: 'price'
            }
          },
          sales_deriv: {
            derivative: {
              buckets_path: 'sales'
            }
          }
        }
      }
    }
  }
)
puts response

POST /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        },
        "sales_deriv": {
          "derivative": {
            "buckets_path": "sales" 
          }
        }
      }
    }
  }
}

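buckets_path instructs this derivative aggregation to use the output of the sales aggregation for the derivative.

And the following may be the response: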

{
   "took": 11,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               } 
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               },
               "sales_deriv": {
                  "value": -490.0 
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2, 
               "sales": {
                  "value": 375.0
               },
               "sales_deriv": {
                  "value": 315.0
               }
            }
         ]
      }
   }
}

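There is no derivative for the first bucket, because at least 2 data points are needed to calculate a derivative.

The derivative value units are implicitly defined by the sales aggregation and the parent histogram, so in this case the units would be $/month, assuming the price field has units of $.

The number of documents in the bucket is represented by doc_count.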
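The values in the sample response can be reproduced by hand as a simple bucket-to-bucket difference. The sketch below is only an illustration of that arithmetic, not Elasticsearch's implementation: the helper name first_differences is hypothetical, the input is the list of monthly sums from the response above, and gap handling (gap_policy) is ignored.

```ruby
# Monthly sums copied from the sample response above.
sales = [550.0, 60.0, 375.0]

# Hypothetical helper: subtract the previous bucket's value from each bucket.
# A bucket with no usable predecessor yields nil, mirroring the missing
# sales_deriv field on the first bucket of the response.
def first_differences(values)
  values.each_with_index.map do |value, i|
    prev = i.zero? ? nil : values[i - 1]
    value.nil? || prev.nil? ? nil : value - prev
  end
end

sales_deriv = first_differences(sales)
p sales_deriv # prints [nil, -490.0, 315.0]
```

The two non-nil entries match the sales_deriv values reported for February and March.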

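Second order derivative

A second order derivative can be calculated by chaining a derivative pipeline aggregation onto the result of another derivative pipeline aggregation, as in the following example, which calculates both the first and the second order derivative of the total monthly sales: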

response = client.search(
  index: 'sales',
  body: {
    size: 0,
    aggregations: {
      sales_per_month: {
        date_histogram: {
          field: 'date',
          calendar_interval: 'month'
        },
        aggregations: {
          sales: {
            sum: {
              field: 'price'
            }
          },
          sales_deriv: {
            derivative: {
              buckets_path: 'sales'
            }
          },
          "sales_2nd_deriv": {
            derivative: {
              buckets_path: 'sales_deriv'
            }
          }
        }
      }
    }
  }
)
puts response

POST /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        },
        "sales_deriv": {
          "derivative": {
            "buckets_path": "sales"
          }
        },
        "sales_2nd_deriv": {
          "derivative": {
            "buckets_path": "sales_deriv" 
          }
        }
      }
    }
  }
}

buckets_path for the second derivative points to the name of the first derivative.

And the following may be the response:

{
   "took": 50,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               } 
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               },
               "sales_deriv": {
                  "value": -490.0
               } 
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               },
               "sales_deriv": {
                  "value": 315.0
               },
               "sales_2nd_deriv": {
                  "value": 805.0
               }
            }
         ]
      }
   }
}

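There is no second derivative for the first two buckets, because at least 2 data points from the first derivative are needed to calculate the second derivative.

Units

The derivative aggregation allows the units of the derivative values to be specified. This returns an extra field in the response, normalized_value, which reports the derivative value in the desired x-axis units. In the example below we calculate the derivative of the total sales per month, but ask for the derivative of the sales to be expressed in units of sales per day: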
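Chaining can be pictured as applying the same differencing step twice. The following is a minimal sketch under the same assumptions as before: first_differences is a hypothetical helper, the input values come from the sample response, and gap handling is left out.

```ruby
# Hypothetical helper: difference between each value and its predecessor,
# or nil when either side is missing.
def first_differences(values)
  values.each_with_index.map do |value, i|
    prev = i.zero? ? nil : values[i - 1]
    value.nil? || prev.nil? ? nil : value - prev
  end
end

sales = [550.0, 60.0, 375.0]

# First pass: derivative of the monthly sums.
sales_deriv = first_differences(sales)

# Second pass: derivative of the first derivative. The first two entries are
# nil because the first derivative only has two data points to work with.
sales_2nd_deriv = first_differences(sales_deriv)
p sales_2nd_deriv # prints [nil, nil, 805.0]
```

The single non-nil entry, 315.0 - (-490.0) = 805.0, matches the sales_2nd_deriv value in the March bucket.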

response = client.search(
  index: 'sales',
  body: {
    size: 0,
    aggregations: {
      sales_per_month: {
        date_histogram: {
          field: 'date',
          calendar_interval: 'month'
        },
        aggregations: {
          sales: {
            sum: {
              field: 'price'
            }
          },
          sales_deriv: {
            derivative: {
              buckets_path: 'sales',
              unit: 'day'
            }
          }
        }
      }
    }
  }
)
puts response

POST /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        },
        "sales_deriv": {
          "derivative": {
            "buckets_path": "sales",
            "unit": "day" 
          }
        }
      }
    }
  }
}

unit specifies what unit to use for the x-axis of the derivative calculation.

And the following may be the response:

{
   "took": 50,
   "timed_out": false,
   "_shards": ...,
   "hits": ...,
   "aggregations": {
      "sales_per_month": {
         "buckets": [
            {
               "key_as_string": "2015/01/01 00:00:00",
               "key": 1420070400000,
               "doc_count": 3,
               "sales": {
                  "value": 550.0
               } 
            },
            {
               "key_as_string": "2015/02/01 00:00:00",
               "key": 1422748800000,
               "doc_count": 2,
               "sales": {
                  "value": 60.0
               },
               "sales_deriv": {
                  "value": -490.0, 
                  "normalized_value": -15.806451612903226 
               }
            },
            {
               "key_as_string": "2015/03/01 00:00:00",
               "key": 1425168000000,
               "doc_count": 2,
               "sales": {
                  "value": 375.0
               },
               "sales_deriv": {
                  "value": 315.0,
                  "normalized_value": 11.25
               }
            }
         ]
      }
   }
}

value is reported in the original units of per month.

normalized_value is reported in the desired units of per day.
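The normalized values in the sample response are consistent with dividing each derivative by the width of the bucket interval expressed in the requested unit: 31 days between the January and February keys, 28 days between February and March. The sketch below reproduces that arithmetic from the bucket keys. It is an assumption-based illustration of the numbers shown, not a description of Elasticsearch internals, and the names MS_PER_DAY and normalized_derivatives are hypothetical.

```ruby
# Milliseconds in one day, the requested x-axis unit.
MS_PER_DAY = 24 * 60 * 60 * 1000

# Bucket keys (epoch milliseconds) and monthly sums from the sample response.
buckets = [
  { key: 1420070400000, sales: 550.0 },
  { key: 1422748800000, sales: 60.0 },
  { key: 1425168000000, sales: 375.0 }
]

# Hypothetical helper: for each pair of adjacent buckets, compute the raw
# derivative and the derivative divided by the key distance in unit_ms units.
def normalized_derivatives(buckets, unit_ms)
  buckets.each_cons(2).map do |prev, curr|
    value = curr[:sales] - prev[:sales]
    x_units = (curr[:key] - prev[:key]).to_f / unit_ms
    { value: value, normalized_value: value / x_units }
  end
end

results = normalized_derivatives(buckets, MS_PER_DAY)
results.each do |r|
  puts "value=#{r[:value]} normalized_value=#{r[:normalized_value]}"
end
```

Because the divisor comes from the distance between bucket keys, months of different lengths give different per-day figures even when the per-month change is the same.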