Example: Enrich your data based on exact values

match enrich policies use a term query to match enrich data to incoming documents based on an exact value, such as an email address or ID.

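The following example creates a match enrich policy that adds user name and contact information to incoming documents based on an email address. It then adds the match enrich policy to a processor in an ingest pipeline.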

Use the create index API or the index API to create a source index.

The following index API request creates a source index and indexes a new document to that index.

resp = client.index(
    index="users",
    id="1",
    refresh="wait_for",
    document={
        "email": "[email protected]",
        "first_name": "Mardy",
        "last_name": "Brown",
        "city": "New Orleans",
        "county": "Orleans",
        "state": "LA",
        "zip": 70116,
        "web": "mardy.asciidocsmith.com"
    },
)
print(resp)
response = client.index(
  index: 'users',
  id: 1,
  refresh: 'wait_for',
  body: {
    email: '[email protected]',
    first_name: 'Mardy',
    last_name: 'Brown',
    city: 'New Orleans',
    county: 'Orleans',
    state: 'LA',
    zip: 70_116,
    web: 'mardy.asciidocsmith.com'
  }
)
puts response
const response = await client.index({
  index: "users",
  id: 1,
  refresh: "wait_for",
  document: {
    email: "[email protected]",
    first_name: "Mardy",
    last_name: "Brown",
    city: "New Orleans",
    county: "Orleans",
    state: "LA",
    zip: 70116,
    web: "mardy.asciidocsmith.com",
  },
});
console.log(response);
PUT /users/_doc/1?refresh=wait_for
{
  "email": "[email protected]",
  "first_name": "Mardy",
  "last_name": "Brown",
  "city": "New Orleans",
  "county": "Orleans",
  "state": "LA",
  "zip": 70116,
  "web": "mardy.asciidocsmith.com"
}

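Use the create enrich policy API to create an enrich policy with the match policy type. This policy must include:

  • One or more source indices
  • A match_field, the field from the source indices used to match incoming documents
  • The enrich fields from the source indices that you want to append to incoming documents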
resp = client.enrich.put_policy(
    name="users-policy",
    match={
        "indices": "users",
        "match_field": "email",
        "enrich_fields": [
            "first_name",
            "last_name",
            "city",
            "zip",
            "state"
        ]
    },
)
print(resp)
response = client.enrich.put_policy(
  name: 'users-policy',
  body: {
    match: {
      indices: 'users',
      match_field: 'email',
      enrich_fields: [
        'first_name',
        'last_name',
        'city',
        'zip',
        'state'
      ]
    }
  }
)
puts response
const response = await client.enrich.putPolicy({
  name: "users-policy",
  match: {
    indices: "users",
    match_field: "email",
    enrich_fields: ["first_name", "last_name", "city", "zip", "state"],
  },
});
console.log(response);
PUT /_enrich/policy/users-policy
{
  "match": {
    "indices": "users",
    "match_field": "email",
    "enrich_fields": ["first_name", "last_name", "city", "zip", "state"]
  }
}

Use the execute enrich policy API to create an enrich index for the policy.

POST /_enrich/policy/users-policy/_execute?wait_for_completion=false
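
This step is shown only as a console request. A minimal Python sketch of the same call follows, assuming the same `client` object used in the other Python snippets and the `enrich.execute_policy` method of the official Python client. The helper name `execute_users_policy` is hypothetical and introduced only for illustration.

```python
def execute_users_policy(client, name="users-policy"):
    """Ask Elasticsearch to build the enrich index for the given policy.

    `client` is assumed to be an already-configured Elasticsearch client,
    like the one used in the other Python snippets on this page.
    """
    # wait_for_completion=False mirrors the query parameter of the console request.
    return client.enrich.execute_policy(name=name, wait_for_completion=False)
```

With a configured client this would be called as `resp = execute_users_policy(client)` followed by `print(resp)`, matching the pattern of the other snippets.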

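Use the create or update pipeline API to create an ingest pipeline. In the pipeline, add an enrich processor that includes:

  • Your enrich policy.
  • The field of incoming documents used to match documents in the enrich index.
  • The target_field used to store the appended enrich data for incoming documents. This field contains the match_field and enrich_fields specified in your enrich policy.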
resp = client.ingest.put_pipeline(
    id="user_lookup",
    processors=[
        {
            "enrich": {
                "description": "Add 'user' data based on 'email'",
                "policy_name": "users-policy",
                "field": "email",
                "target_field": "user",
                "max_matches": "1"
            }
        }
    ],
)
print(resp)
const response = await client.ingest.putPipeline({
  id: "user_lookup",
  processors: [
    {
      enrich: {
        description: "Add 'user' data based on 'email'",
        policy_name: "users-policy",
        field: "email",
        target_field: "user",
        max_matches: "1",
      },
    },
  ],
});
console.log(response);
PUT /_ingest/pipeline/user_lookup
{
  "processors" : [
    {
      "enrich" : {
        "description": "Add 'user' data based on 'email'",
        "policy_name": "users-policy",
        "field" : "email",
        "target_field": "user",
        "max_matches": "1"
      }
    }
  ]
}
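
To make the roles of `match_field`, `enrich_fields`, `field`, and `target_field` concrete, the following pure-Python sketch models the lookup that the policy and processor describe. It is an illustrative model only, not how Elasticsearch implements enrichment. The function names and the `user@example.com` address are placeholders that do not come from the example above.

```python
def build_enrich_index(source_docs, match_field, enrich_fields):
    """Model an enrich index as a dict keyed by the match_field value."""
    # Each entry keeps the match_field plus the requested enrich_fields.
    keep = [match_field] + [f for f in enrich_fields if f != match_field]
    index = {}
    for doc in source_docs:
        if match_field not in doc:
            continue  # nothing to match on, so the document is skipped
        # Simplification: a later document with the same key overwrites an earlier one.
        index[doc[match_field]] = {f: doc[f] for f in keep if f in doc}
    return index


def apply_enrich(incoming, index, field, target_field):
    """Model an enrich processor that keeps at most one match."""
    enriched = dict(incoming)
    match = index.get(incoming.get(field))
    if match is not None:
        enriched[target_field] = dict(match)
    return enriched


users = [{
    "email": "user@example.com",  # placeholder address, not from the source
    "first_name": "Mardy",
    "last_name": "Brown",
    "city": "New Orleans",
    "county": "Orleans",
    "state": "LA",
    "zip": 70116,
    "web": "mardy.asciidocsmith.com",
}]
enrich_index = build_enrich_index(
    users, "email", ["first_name", "last_name", "city", "zip", "state"]
)
result = apply_enrich({"email": "user@example.com"}, enrich_index, "email", "user")
print(result)
```

In this model the `user` object carries `email` and the five enrich fields, while `county` and `web` are left out because the policy does not list them. An incoming document whose `email` has no entry in the lookup is returned unchanged.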

Use the ingest pipeline to index a document. The incoming document should include the field specified in your enrich processor.

resp = client.index(
    index="my-index-000001",
    id="my_id",
    pipeline="user_lookup",
    document={
        "email": "[email protected]"
    },
)
print(resp)
const response = await client.index({
  index: "my-index-000001",
  id: "my_id",
  pipeline: "user_lookup",
  document: {
    email: "[email protected]",
  },
});
console.log(response);
PUT /my-index-000001/_doc/my_id?pipeline=user_lookup
{
  "email": "[email protected]"
}

To verify that the enrich processor matched and appended the appropriate field data, use the get API to view the indexed document.

resp = client.get(
    index="my-index-000001",
    id="my_id",
)
print(resp)
response = client.get(
  index: 'my-index-000001',
  id: 'my_id'
)
puts response
const response = await client.get({
  index: "my-index-000001",
  id: "my_id",
});
console.log(response);
GET /my-index-000001/_doc/my_id

The API returns the following response:

{
  "found": true,
  "_index": "my-index-000001",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 55,
  "_primary_term": 1,
  "_source": {
    "user": {
      "email": "[email protected]",
      "first_name": "Mardy",
      "last_name": "Brown",
      "zip": 70116,
      "city": "New Orleans",
      "state": "LA"
    },
    "email": "[email protected]"
  }
}
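
A check like this can also be scripted. The sketch below assumes the get response is available as a plain dictionary shaped like the one above. The helper name `missing_enrich_fields` and the `user@example.com` address are hypothetical.

```python
def missing_enrich_fields(response, target_field, expected_fields):
    """Return the expected fields that are absent from the enriched object."""
    enriched = response.get("_source", {}).get(target_field)
    if not isinstance(enriched, dict):
        # The processor found no match, so nothing was appended.
        return sorted(expected_fields)
    return sorted(f for f in expected_fields if f not in enriched)


# Trimmed response for illustration; the address is a placeholder.
sample_response = {
    "found": True,
    "_source": {
        "user": {
            "email": "user@example.com",
            "first_name": "Mardy",
            "last_name": "Brown",
            "zip": 70116,
            "city": "New Orleans",
            "state": "LA",
        },
        "email": "user@example.com",
    },
}
expected = ["email", "first_name", "last_name", "city", "zip", "state"]
missing = missing_enrich_fields(sample_response, "user", expected)
print(missing)
```

An empty list means every field named in the policy reached the `user` object. A non-empty list names the fields to investigate, for example when the enrich index was built before the source document was indexed.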