Normalize aggregation
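A parent pipeline aggregation that calculates the normalized/rescaled value for each bucket value. Values that cannot be normalized are skipped using the skip gap policy.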
Syntax
A `normalize` aggregation looks like this in isolation:
{ "normalize": { "buckets_path": "normalized", "method": "percent_of_sum" } }
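Table 77. `normalize_pipeline` parameters
Parameter name | Description | Required | Default value |
---|---|---|---|
`buckets_path` | The path to the buckets we wish to normalize (see the `buckets_path` syntax for more details) | Required | |
`method` | The specific method to apply | Required | |
`format` | DecimalFormat pattern for the output value. If specified, the formatted value is returned in the aggregation's `value_as_string` property | Optional | |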
Methods
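The normalize aggregation supports multiple methods for transforming the bucket values. Each method definition uses the following original set of bucket values as its example: [5, 5, 10, 50, 10, 20].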
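- `rescale_0_1`
  This method rescales the data so that the minimum value becomes zero and the maximum value becomes 1, with the remaining values normalized linearly in between.
  x' = (x - min_x) / (max_x - min_x)
  [0, 0, .1111, 1, .1111, .3333]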
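- `rescale_0_100`
  This method rescales the data so that the minimum value becomes zero and the maximum value becomes 100, with the remaining values normalized linearly in between.
  x' = 100 * (x - min_x) / (max_x - min_x)
  [0, 0, 11.11, 100, 11.11, 33.33]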
- `percent_of_sum`
  This method normalizes each value so that it represents its percentage contribution to the total sum.
  x' = x / sum_x
  [5%, 5%, 10%, 50%, 10%, 20%]
- `mean`
  This method normalizes each value by how much it differs from the average, relative to the range of the data.
  x' = (x - mean_x) / (max_x - min_x)
  [-0.2593, -0.2593, -0.1481, 0.7407, -0.1481, 0.0741]
- `z-score`
  This method normalizes each value so that it represents how far it is from the mean, relative to the standard deviation.
  x' = (x - mean_x) / stdev_x
  [-0.68, -0.68, -0.39, 1.94, -0.39, 0.19]
- `softmax`
  This method exponentiates each value and normalizes it relative to the sum of the exponentials of the original values.
  x' = e^x / sum_e_x
  [2.862E-20, 2.862E-20, 4.248E-18, 0.999, 4.248E-18, 9.357E-14]
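The formulas above can be checked with a minimal pure-Python sketch. This is not the aggregation's actual implementation, and the function names are hypothetical, chosen only for illustration. The `z_score` helper assumes the sample standard deviation (n - 1 denominator), because that choice reproduces the example figures listed for `z-score`; the text above does not say which variant the aggregation uses.

```python
import math


def rescale(values, upper=1.0):
    """rescale_0_1 (upper=1) or rescale_0_100 (upper=100)."""
    lo, hi = min(values), max(values)
    return [upper * (x - lo) / (hi - lo) for x in values]


def percent_of_sum(values):
    """Each value as a fraction of the total (0.05 means 5%)."""
    total = sum(values)
    return [x / total for x in values]


def mean_normalize(values):
    """Difference from the mean, divided by the range (max - min)."""
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    return [(x - mean) / spread for x in values]


def z_score(values):
    """Difference from the mean, in units of standard deviation.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / (len(values) - 1)
    stdev = math.sqrt(variance)
    return [(x - mean) / stdev for x in values]


def softmax(values):
    """e^x divided by the sum of e^x over all values."""
    exps = [math.exp(x) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    values = [5, 5, 10, 50, 10, 20]
    for fn in (rescale, percent_of_sum, mean_normalize, z_score, softmax):
        print(fn.__name__, [f"{v:.4g}" for v in fn(values)])
    print("rescale_0_100", [f"{v:.4g}" for v in rescale(values, upper=100.0)])
```

Note that `rescale` and `mean_normalize` divide by `max - min`, so this sketch raises `ZeroDivisionError` when all values are equal; how the aggregation itself treats that case is not covered here.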
Example
The following snippet calculates each month's sales as a percentage of total sales:
# `client` is an already-configured Elasticsearch Python client.
resp = client.search(
    index="sales",
    size=0,
    aggs={
        "sales_per_month": {
            "date_histogram": {"field": "date", "calendar_interval": "month"},
            "aggs": {
                "sales": {"sum": {"field": "price"}},
                "percent_of_total_sales": {
                    "normalize": {
                        "buckets_path": "sales",
                        "method": "percent_of_sum",
                        "format": "00.00%",
                    }
                },
            },
        }
    },
)
print(resp)
response = client.search(
  index: 'sales',
  body: {
    size: 0,
    aggregations: {
      sales_per_month: {
        date_histogram: {
          field: 'date',
          calendar_interval: 'month'
        },
        aggregations: {
          sales: {
            sum: { field: 'price' }
          },
          percent_of_total_sales: {
            normalize: {
              buckets_path: 'sales',
              method: 'percent_of_sum',
              format: '00.00%'
            }
          }
        }
      }
    }
  }
)
puts response
const response = await client.search({
  index: "sales",
  size: 0,
  aggs: {
    sales_per_month: {
      date_histogram: {
        field: "date",
        calendar_interval: "month",
      },
      aggs: {
        sales: {
          sum: {
            field: "price",
          },
        },
        percent_of_total_sales: {
          normalize: {
            buckets_path: "sales",
            method: "percent_of_sum",
            format: "00.00%",
          },
        },
      },
    },
  },
});
console.log(response);
POST /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        },
        "percent_of_total_sales": {
          "normalize": {
            "buckets_path": "sales",
            "method": "percent_of_sum",
            "format": "00.00%"
          }
        }
      }
    }
  }
}
The response may look like the following:
{
  "took": 11,
  "timed_out": false,
  "_shards": ...,
  "hits": ...,
  "aggregations": {
    "sales_per_month": {
      "buckets": [
        {
          "key_as_string": "2015/01/01 00:00:00",
          "key": 1420070400000,
          "doc_count": 3,
          "sales": {
            "value": 550.0
          },
          "percent_of_total_sales": {
            "value": 0.5583756345177665,
            "value_as_string": "55.84%"
          }
        },
        {
          "key_as_string": "2015/02/01 00:00:00",
          "key": 1422748800000,
          "doc_count": 2,
          "sales": {
            "value": 60.0
          },
          "percent_of_total_sales": {
            "value": 0.06091370558375635,
            "value_as_string": "06.09%"
          }
        },
        {
          "key_as_string": "2015/03/01 00:00:00",
          "key": 1425168000000,
          "doc_count": 2,
          "sales": {
            "value": 375.0
          },
          "percent_of_total_sales": {
            "value": 0.38071065989847713,
            "value_as_string": "38.07%"
          }
        }
      ]
    }
  }
}
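As a sanity check on the figures in this response, the sketch below recomputes `percent_of_sum` from the three monthly `sales` values. The helper name `percent_of_total` is hypothetical, and the Python format spec `06.2%` is only an approximation of the `00.00%` pattern for values like these; it is not Java's DecimalFormat.

```python
def percent_of_total(values):
    """Return each value as a fraction of the sum of all values."""
    total = sum(values)
    return [v / total for v in values]


# Monthly "sales" sums copied from the response buckets.
monthly_sales = [550.0, 60.0, 375.0]

for sales, share in zip(monthly_sales, percent_of_total(monthly_sales)):
    # "06.2%" multiplies by 100, keeps two decimals, appends "%",
    # and zero-pads the result to six characters.
    print(sales, share, format(share, "06.2%"))
```

The three sums add up to 985, so January's share is 550 / 985, which is where the `0.558...` value and its `55.84%` string form come from.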