Inference bucket aggregation

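A parent pipeline aggregation that loads a pre-trained model and performs inference on the collated result fields from the parent bucket aggregation.

To use the inference bucket aggregation, you need the same security privileges that are required to use the get trained models API.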

Syntax

In isolation, an `inference` aggregation looks like this:

{
  "inference": {
    "model_id": "a_model_for_inference", 
    "inference_config": { 
      "regression_config": {
        "num_top_feature_importance_values": 2
      }
    },
    "buckets_path": {
      "avg_cost": "avg_agg", 
      "max_cost": "max_agg"
    }
  }
}

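	`model_id`: The unique identifier or alias of the trained model.
	`inference_config`: An optional inference configuration that overrides the model's default settings.
	`"avg_cost": "avg_agg"`: Maps the value of `avg_agg` to the model's input field `avg_cost`.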

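Table 63. `inference` parameters

Parameter name	Description	Required	Default value
`model_id`	The ID or alias of the trained model.	Required	-
`inference_config`	Contains the inference type and its options. There are two types: `regression` and `classification`.	Optional	-
`buckets_path`	Defines the paths to the input aggregations and maps the aggregation names to the field names expected by the model. See the `buckets_path` syntax for more details.	Required	-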
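
The parameter table can be sketched as a small helper that assembles the aggregation body and rejects a missing required parameter before the request is sent. The function name `build_inference_agg` is hypothetical and not part of any Elasticsearch client; it only builds a plain dict shaped like the syntax example above.

```python
def build_inference_agg(model_id, buckets_path, inference_config=None):
    """Build an `inference` aggregation body as a plain dict (illustrative helper)."""
    # `model_id` and `buckets_path` are listed as required in the table.
    if not model_id:
        raise ValueError("model_id is required")
    if not buckets_path:
        raise ValueError("buckets_path is required")

    body = {"model_id": model_id, "buckets_path": dict(buckets_path)}
    # `inference_config` is optional; omit it to keep the model's defaults.
    if inference_config is not None:
        body["inference_config"] = inference_config
    return {"inference": body}


agg = build_inference_agg(
    "a_model_for_inference",
    {"avg_cost": "avg_agg", "max_cost": "max_agg"},
)
```

The returned dict can be placed under a sub-aggregation name in the `aggs` section of a search request.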

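Configuration options for inference models

The `inference_config` setting is optional and usually not needed, because pre-trained models come with sensible defaults. In the context of aggregations, some options can be overridden for each of the two model types.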

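Configuration options for regression models

num_top_feature_importance_values: (Optional, integer) Specifies the maximum number of feature importance values per document. Defaults to 0, which means no feature importance calculation occurs.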
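
As a minimal sketch, the regression override can be built like this. The helper `regression_options` is hypothetical, the wrapper key `regression_config` is copied from the syntax example above, and rejecting negative values is an assumption (the text only says the option is a maximum count).

```python
def regression_options(num_top_feature_importance_values=0):
    """Return the regression options as a plain dict (illustrative helper)."""
    n = num_top_feature_importance_values
    # Assumption: a maximum count must be a non-negative integer.
    if isinstance(n, bool) or not isinstance(n, int) or n < 0:
        raise ValueError("num_top_feature_importance_values must be an integer >= 0")
    return {"num_top_feature_importance_values": n}


# Wrapper key taken from the syntax example earlier in this page.
inference_config = {"regression_config": regression_options(2)}
```

With the default of 0, no feature importance values are requested, which matches the documented default behavior.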

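Configuration options for classification models

num_top_classes: (Optional, integer) Specifies the number of top class predictions to return. Defaults to 0.
num_top_feature_importance_values: (Optional, integer) Specifies the maximum number of feature importance values per document. Defaults to 0, which means no feature importance calculation occurs.
prediction_field_type: (Optional, string) Specifies the type of the predicted field to write. Valid values are: string, number, boolean. When boolean is provided, 1.0 is converted to true and 0.0 is converted to false.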
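
The classification options above can be sketched as a dict builder that checks `prediction_field_type` against the three documented values. Both function names are hypothetical. `to_boolean_prediction` only mirrors the documented 1.0/0.0 conversion on the client side for illustration; the text does not say what happens for other values, so raising an error there is an assumption.

```python
VALID_PREDICTION_FIELD_TYPES = ("string", "number", "boolean")


def classification_options(num_top_classes=0,
                           num_top_feature_importance_values=0,
                           prediction_field_type=None):
    """Return the classification options as a plain dict (illustrative helper)."""
    options = {
        "num_top_classes": num_top_classes,
        "num_top_feature_importance_values": num_top_feature_importance_values,
    }
    if prediction_field_type is not None:
        if prediction_field_type not in VALID_PREDICTION_FIELD_TYPES:
            raise ValueError(
                "prediction_field_type must be one of: "
                + ", ".join(VALID_PREDICTION_FIELD_TYPES)
            )
        options["prediction_field_type"] = prediction_field_type
    return options


def to_boolean_prediction(value):
    """Mirror the documented conversion: 1.0 becomes true, 0.0 becomes false."""
    if value == 1.0:
        return True
    if value == 0.0:
        return False
    # Assumption: other values are not covered by the text, so reject them here.
    raise ValueError("only 1.0 and 0.0 have a documented boolean conversion")


options = classification_options(num_top_classes=2, prediction_field_type="boolean")
```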

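Example

The following snippet aggregates a web log by `client_ip` and extracts a number of features through metric and bucket sub-aggregations. These features are the input to an inference aggregation configured with a model trained to identify suspicious client IPs: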

resp = client.search(
    index="kibana_sample_data_logs",
    size=0,
    aggs={
        "client_ip": {
            "composite": {
                "sources": [
                    {
                        "client_ip": {
                            "terms": {
                                "field": "clientip"
                            }
                        }
                    }
                ]
            },
            "aggs": {
                "url_dc": {
                    "cardinality": {
                        "field": "url.keyword"
                    }
                },
                "bytes_sum": {
                    "sum": {
                        "field": "bytes"
                    }
                },
                "geo_src_dc": {
                    "cardinality": {
                        "field": "geo.src"
                    }
                },
                "geo_dest_dc": {
                    "cardinality": {
                        "field": "geo.dest"
                    }
                },
                "responses_total": {
                    "value_count": {
                        "field": "timestamp"
                    }
                },
                "success": {
                    "filter": {
                        "term": {
                            "response": "200"
                        }
                    }
                },
                "error404": {
                    "filter": {
                        "term": {
                            "response": "404"
                        }
                    }
                },
                "error503": {
                    "filter": {
                        "term": {
                            "response": "503"
                        }
                    }
                },
                "malicious_client_ip": {
                    "inference": {
                        "model_id": "malicious_clients_model",
                        "buckets_path": {
                            "response_count": "responses_total",
                            "url_dc": "url_dc",
                            "bytes_sum": "bytes_sum",
                            "geo_src_dc": "geo_src_dc",
                            "geo_dest_dc": "geo_dest_dc",
                            "success": "success._count",
                            "error404": "error404._count",
                            "error503": "error503._count"
                        }
                    }
                }
            }
        }
    },
)
print(resp)

response = client.search(
  index: 'kibana_sample_data_logs',
  body: {
    size: 0,
    aggregations: {
      client_ip: {
        composite: {
          sources: [
            {
              client_ip: {
                terms: {
                  field: 'clientip'
                }
              }
            }
          ]
        },
        aggregations: {
          url_dc: {
            cardinality: {
              field: 'url.keyword'
            }
          },
          bytes_sum: {
            sum: {
              field: 'bytes'
            }
          },
          geo_src_dc: {
            cardinality: {
              field: 'geo.src'
            }
          },
          geo_dest_dc: {
            cardinality: {
              field: 'geo.dest'
            }
          },
          responses_total: {
            value_count: {
              field: 'timestamp'
            }
          },
          success: {
            filter: {
              term: {
                response: '200'
              }
            }
          },
          "error404": {
            filter: {
              term: {
                response: '404'
              }
            }
          },
          "error503": {
            filter: {
              term: {
                response: '503'
              }
            }
          },
          malicious_client_ip: {
            inference: {
              model_id: 'malicious_clients_model',
              buckets_path: {
                response_count: 'responses_total',
                url_dc: 'url_dc',
                bytes_sum: 'bytes_sum',
                geo_src_dc: 'geo_src_dc',
                geo_dest_dc: 'geo_dest_dc',
                success: 'success._count',
                "error404": 'error404._count',
                "error503": 'error503._count'
              }
            }
          }
        }
      }
    }
  }
)
puts response

const response = await client.search({
  index: "kibana_sample_data_logs",
  size: 0,
  aggs: {
    client_ip: {
      composite: {
        sources: [
          {
            client_ip: {
              terms: {
                field: "clientip",
              },
            },
          },
        ],
      },
      aggs: {
        url_dc: {
          cardinality: {
            field: "url.keyword",
          },
        },
        bytes_sum: {
          sum: {
            field: "bytes",
          },
        },
        geo_src_dc: {
          cardinality: {
            field: "geo.src",
          },
        },
        geo_dest_dc: {
          cardinality: {
            field: "geo.dest",
          },
        },
        responses_total: {
          value_count: {
            field: "timestamp",
          },
        },
        success: {
          filter: {
            term: {
              response: "200",
            },
          },
        },
        error404: {
          filter: {
            term: {
              response: "404",
            },
          },
        },
        error503: {
          filter: {
            term: {
              response: "503",
            },
          },
        },
        malicious_client_ip: {
          inference: {
            model_id: "malicious_clients_model",
            buckets_path: {
              response_count: "responses_total",
              url_dc: "url_dc",
              bytes_sum: "bytes_sum",
              geo_src_dc: "geo_src_dc",
              geo_dest_dc: "geo_dest_dc",
              success: "success._count",
              error404: "error404._count",
              error503: "error503._count",
            },
          },
        },
      },
    },
  },
});
console.log(response);

GET kibana_sample_data_logs/_search
{
  "size": 0,
  "aggs": {
    "client_ip": { 
      "composite": {
        "sources": [
          {
            "client_ip": {
              "terms": {
                "field": "clientip"
              }
            }
          }
        ]
      },
      "aggs": { 
        "url_dc": {
          "cardinality": {
            "field": "url.keyword"
          }
        },
        "bytes_sum": {
          "sum": {
            "field": "bytes"
          }
        },
        "geo_src_dc": {
          "cardinality": {
            "field": "geo.src"
          }
        },
        "geo_dest_dc": {
          "cardinality": {
            "field": "geo.dest"
          }
        },
        "responses_total": {
          "value_count": {
            "field": "timestamp"
          }
        },
        "success": {
          "filter": {
            "term": {
              "response": "200"
            }
          }
        },
        "error404": {
          "filter": {
            "term": {
              "response": "404"
            }
          }
        },
        "error503": {
          "filter": {
            "term": {
              "response": "503"
            }
          }
        },
        "malicious_client_ip": { 
          "inference": {
            "model_id": "malicious_clients_model",
            "buckets_path": {
              "response_count": "responses_total",
              "url_dc": "url_dc",
              "bytes_sum": "bytes_sum",
              "geo_src_dc": "geo_src_dc",
              "geo_dest_dc": "geo_dest_dc",
              "success": "success._count",
              "error404": "error404._count",
              "error503": "error503._count"
            }
          }
        }
      }
    }
  }
}

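	`client_ip`: A composite bucket aggregation that aggregates the data by `client_ip`.
	`aggs`: A series of metric and bucket sub-aggregations.
	`malicious_client_ip`: The inference bucket aggregation, which specifies the trained model and maps the aggregation names to the model's input fields.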
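
In the example, metric aggregations are referenced by name in `buckets_path`, while the `filter` bucket aggregations are referenced through their document count with the `._count` suffix. A minimal sketch of building that mapping follows; the helper `build_buckets_path` and its `renames` parameter are hypothetical and only reproduce the mapping shown in the request above.

```python
def build_buckets_path(metric_aggs, bucket_aggs, renames=None):
    """Map model input fields to aggregation paths (illustrative helper).

    metric_aggs: names of single-value metric sub-aggregations.
    bucket_aggs: names of bucket sub-aggregations whose doc count is the feature.
    renames: optional {aggregation name: model field name} overrides.
    """
    renames = renames or {}
    path = {}
    for name in metric_aggs:
        # Metric aggregations are referenced directly by name.
        path[renames.get(name, name)] = name
    for name in bucket_aggs:
        # Bucket aggregations contribute their document count.
        path[renames.get(name, name)] = f"{name}._count"
    return path


buckets_path = build_buckets_path(
    metric_aggs=["responses_total", "url_dc", "bytes_sum", "geo_src_dc", "geo_dest_dc"],
    bucket_aggs=["success", "error404", "error503"],
    renames={"responses_total": "response_count"},
)
```

Here `responses_total` is renamed because the model in the example expects the field `response_count`.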
