Retrieve inner hits

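The parent-join and nested features allow documents to be returned based on matches in a different scope. In the parent/child case, parent documents are returned based on matches in child documents, or child documents are returned based on matches in parent documents. In the nested case, documents are returned based on matches in nested inner objects.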

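In both cases, the actual matches in the other scope that caused a document to be returned are hidden. It is often useful to know which inner nested objects (in the nested case) or which child or parent documents (in the parent/child case) caused a particular hit to be returned. The inner hits feature serves this purpose: for each search hit in the search response, it returns the additional nested hits that caused that search hit to match in a different scope.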

To use inner hits, add an inner_hits definition to a nested, has_child, or has_parent query or filter. The structure looks like this:

"<query>" : {
    "inner_hits" : {
        <inner_hits_options>
    }
}

If inner_hits is defined on a query that supports it, each search hit contains an inner_hits JSON object with the following structure:

"hits": [
     {
        "_index": ...,
        "_type": ...,
        "_id": ...,
        "inner_hits": {
           "<inner_hits_name>": {
              "hits": {
                 "total": ...,
                 "hits": [
                    {
                       "_id": ...,
                       ...
                    },
                    ...
                 ]
              }
           }
        },
        ...
     },
     ...
]
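
As a minimal sketch of how a client might walk this structure, the following Python generator yields every inner hit together with the ID of the search hit it belongs to and the inner hits name. The function name iter_inner_hits and the sample response are illustrative assumptions, not part of any Elasticsearch client API.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_inner_hits(response: Dict[str, Any]) -> Iterator[Tuple[str, str, Dict[str, Any]]]:
    """Yield (outer_hit_id, inner_hits_name, inner_hit) for each inner hit."""
    for hit in response.get("hits", {}).get("hits", []):
        # "inner_hits" is only present when the query defined inner_hits.
        for name, group in hit.get("inner_hits", {}).items():
            for inner in group["hits"]["hits"]:
                yield hit["_id"], name, inner


# Hypothetical response, shaped like the structure shown above.
sample_response = {
    "hits": {
        "hits": [
            {
                "_index": "test",
                "_id": "1",
                "inner_hits": {
                    "comments": {
                        "hits": {
                            "total": {"value": 1, "relation": "eq"},
                            "hits": [{"_id": "1", "_source": {"number": 2}}],
                        }
                    }
                },
            }
        ]
    }
}

for outer_id, name, inner in iter_inner_hits(sample_response):
    print(outer_id, name, inner["_source"])
```

Search hits without an inner_hits key are skipped, so the same loop works for responses where only some queries define inner hits.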

Options

Inner hits support the following options:

from

The offset of the first hit to fetch for each inner_hits in the returned regular search hits.

size

The maximum number of hits to return per inner_hits. By default, the top three matching hits are returned.

sort

How the inner hits should be sorted per inner_hits. By default, the hits are sorted by score.

name

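The name used for a particular inner hit definition in the response. Useful when multiple inner hits are defined in a single search request. The default depends on the query in which the inner hit is defined: for has_child queries and filters it is the child type, for has_parent queries and filters it is the parent type, and for nested queries and filters it is the nested path.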
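
The options above can be combined in a single inner_hits definition. The sketch below builds such a query body as a plain Python dict. The helper name nested_query_with_inner_hits, the comments field, and the custom name matching_comments are hypothetical and chosen only for illustration. Options that are not set are omitted so that the defaults described above apply.

```python
from typing import Any, Dict, List, Optional


def nested_query_with_inner_hits(
    path: str,
    query: Dict[str, Any],
    name: Optional[str] = None,
    from_: Optional[int] = None,
    size: Optional[int] = None,
    sort: Optional[List[Any]] = None,
) -> Dict[str, Any]:
    """Build a nested query body; unset options are left out of inner_hits."""
    inner_hits: Dict[str, Any] = {}
    if name is not None:
        inner_hits["name"] = name
    if from_ is not None:
        inner_hits["from"] = from_
    if size is not None:
        inner_hits["size"] = size
    if sort is not None:
        inner_hits["sort"] = sort
    return {"nested": {"path": path, "query": query, "inner_hits": inner_hits}}


# Hypothetical field names, used only to show the shape of the request.
body = nested_query_with_inner_hits(
    path="comments",
    query={"match": {"comments.number": 2}},
    name="matching_comments",  # key used in the response instead of the nested path
    size=5,                    # up to five inner hits per search hit
    sort=[{"comments.number": "desc"}],
)
print(body)
```

The resulting dict is intended to be passed as the query of a search request. With name set, the inner hits appear in the response under matching_comments rather than comments.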

Inner hits also support per-document features, including _source filtering, stored fields, and doc value fields.

Nested inner hits

The nested inner_hits can be used to include nested inner objects as inner hits to a search hit.

Python:

resp = client.indices.create(
    index="test",
    mappings={
        "properties": {
            "comments": {
                "type": "nested"
            }
        }
    },
)
print(resp)

resp1 = client.index(
    index="test",
    id="1",
    refresh=True,
    document={
        "title": "Test title",
        "comments": [
            {
                "author": "kimchy",
                "number": 1
            },
            {
                "author": "nik9000",
                "number": 2
            }
        ]
    },
)
print(resp1)

resp2 = client.search(
    index="test",
    query={
        "nested": {
            "path": "comments",
            "query": {
                "match": {
                    "comments.number": 2
                }
            },
            "inner_hits": {}
        }
    },
)
print(resp2)

Ruby:

response = client.indices.create(
  index: 'test',
  body: {
    mappings: {
      properties: {
        comments: {
          type: 'nested'
        }
      }
    }
  }
)
puts response

response = client.index(
  index: 'test',
  id: 1,
  refresh: true,
  body: {
    title: 'Test title',
    comments: [
      {
        author: 'kimchy',
        number: 1
      },
      {
        author: 'nik9000',
        number: 2
      }
    ]
  }
)
puts response

response = client.search(
  index: 'test',
  body: {
    query: {
      nested: {
        path: 'comments',
        query: {
          match: {
            'comments.number' => 2
          }
        },
        inner_hits: {}
      }
    }
  }
)
puts response

JavaScript:

const response = await client.indices.create({
  index: "test",
  mappings: {
    properties: {
      comments: {
        type: "nested",
      },
    },
  },
});
console.log(response);

const response1 = await client.index({
  index: "test",
  id: 1,
  refresh: "true",
  document: {
    title: "Test title",
    comments: [
      {
        author: "kimchy",
        number: 1,
      },
      {
        author: "nik9000",
        number: 2,
      },
    ],
  },
});
console.log(response1);

const response2 = await client.search({
  index: "test",
  query: {
    nested: {
      path: "comments",
      query: {
        match: {
          "comments.number": 2,
        },
      },
      inner_hits: {},
    },
  },
});
console.log(response2);

Console:

PUT test
{
  "mappings": {
    "properties": {
      "comments": {
        "type": "nested"
      }
    }
  }
}

PUT test/_doc/1?refresh
{
  "title": "Test title",
  "comments": [
    {
      "author": "kimchy",
      "number": 1
    },
    {
      "author": "nik9000",
      "number": 2
    }
  ]
}

POST test/_search
{
  "query": {
    "nested": {
      "path": "comments",
      "query": {
        "match": {"comments.number" : 2}
      },
      "inner_hits": {} 
    }
  }
}

The "inner_hits": {} entry is the inner hit definition in the nested query. No other options need to be defined.

An example of a response snippet that could be generated from the above search request:

{
  ...,
  "hits": {
    "total" : {
        "value": 1,
        "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "test",
        "_id": "1",
        "_score": 1.0,
        "_source": ...,
        "inner_hits": {
          "comments": { 
            "hits": {
              "total" : {
                  "value": 1,
                  "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "test",
                  "_id": "1",
                  "_nested": {
                    "field": "comments",
                    "offset": 1
                  },
                  "_score": 1.0,
                  "_source": {
                    "author": "nik9000",
                    "number": 2
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

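The "comments" key under inner_hits is the name used in the inner hit definition in the search request. A custom key can be used via the name option.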

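The _nested metadata is crucial in the above example, because it identifies which inner nested object this inner hit came from. The field defines the object array field the nested hit is from, and the offset is its position within that array in the _source. Because of sorting and scoring, the position of a hit object in inner_hits is usually different from the position at which the nested inner object was defined.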

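By default, the _source is also returned for the hit objects in inner_hits, but this can be changed. With the _source filtering feature, part of the source can be returned or the source can be disabled. If stored fields are defined on the nested level, these can also be returned via the fields feature.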

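An important default is that the _source returned in hits inside inner_hits is relative to the _nested metadata. So in the above example, only the comment part is returned per nested hit, not the entire source of the top-level document that contains the comment.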
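
To illustrate how field and offset relate an inner hit to the root document, the following sketch resolves a _nested entry against the root _source. The helper name resolve_nested is hypothetical, and the sketch assumes the nested field is stored as an array in _source, as in the example above. It also follows further _nested levels if they are present.

```python
from typing import Any, Dict


def resolve_nested(root_source: Dict[str, Any], nested: Dict[str, Any]) -> Any:
    """Follow a _nested entry (and any deeper _nested levels) into the root _source."""
    current: Any = root_source
    node = nested
    while node is not None:
        # "field" names the object array; "offset" is the index within that array.
        current = current[node["field"]][node["offset"]]
        node = node.get("_nested")
    return current


# Root document and inner hit taken from the example above.
root_source = {
    "title": "Test title",
    "comments": [
        {"author": "kimchy", "number": 1},
        {"author": "nik9000", "number": 2},
    ],
}
inner_hit = {
    "_nested": {"field": "comments", "offset": 1},
    "_source": {"author": "nik9000", "number": 2},
}

resolved = resolve_nested(root_source, inner_hit["_nested"])
# The inner hit's _source is the slice of the root _source that _nested points at.
print(resolved == inner_hit["_source"])
```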

Nested inner hits and _source

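Nested documents don't have a _source field, because the entire source of a document is stored with the root document under its _source field. To include the source of just the nested document, the source of the root document is parsed and only the relevant part for the nested document is included as the source in the inner hit. Doing this for each matching nested document affects the time it takes to execute the entire search request, especially when size and the inner hits' size are set higher than the default. To avoid the relatively expensive source extraction for nested inner hits, you can disable including the source and rely solely on doc values fields, like this: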

Python:

resp = client.indices.create(
    index="test",
    mappings={
        "properties": {
            "comments": {
                "type": "nested"
            }
        }
    },
)
print(resp)

resp1 = client.index(
    index="test",
    id="1",
    refresh=True,
    document={
        "title": "Test title",
        "comments": [
            {
                "author": "kimchy",
                "text": "comment text"
            },
            {
                "author": "nik9000",
                "text": "words words words"
            }
        ]
    },
)
print(resp1)

resp2 = client.search(
    index="test",
    query={
        "nested": {
            "path": "comments",
            "query": {
                "match": {
                    "comments.text": "words"
                }
            },
            "inner_hits": {
                "_source": False,
                "docvalue_fields": [
                    "comments.text.keyword"
                ]
            }
        }
    },
)
print(resp2)

Ruby:

response = client.indices.create(
  index: 'test',
  body: {
    mappings: {
      properties: {
        comments: {
          type: 'nested'
        }
      }
    }
  }
)
puts response

response = client.index(
  index: 'test',
  id: 1,
  refresh: true,
  body: {
    title: 'Test title',
    comments: [
      {
        author: 'kimchy',
        text: 'comment text'
      },
      {
        author: 'nik9000',
        text: 'words words words'
      }
    ]
  }
)
puts response

response = client.search(
  index: 'test',
  body: {
    query: {
      nested: {
        path: 'comments',
        query: {
          match: {
            'comments.text' => 'words'
          }
        },
        inner_hits: {
          _source: false,
          docvalue_fields: [
            'comments.text.keyword'
          ]
        }
      }
    }
  }
)
puts response

JavaScript:

const response = await client.indices.create({
  index: "test",
  mappings: {
    properties: {
      comments: {
        type: "nested",
      },
    },
  },
});
console.log(response);

const response1 = await client.index({
  index: "test",
  id: 1,
  refresh: "true",
  document: {
    title: "Test title",
    comments: [
      {
        author: "kimchy",
        text: "comment text",
      },
      {
        author: "nik9000",
        text: "words words words",
      },
    ],
  },
});
console.log(response1);

const response2 = await client.search({
  index: "test",
  query: {
    nested: {
      path: "comments",
      query: {
        match: {
          "comments.text": "words",
        },
      },
      inner_hits: {
        _source: false,
        docvalue_fields: ["comments.text.keyword"],
      },
    },
  },
});
console.log(response2);

Console:

PUT test
{
  "mappings": {
    "properties": {
      "comments": {
        "type": "nested"
      }
    }
  }
}

PUT test/_doc/1?refresh
{
  "title": "Test title",
  "comments": [
    {
      "author": "kimchy",
      "text": "comment text"
    },
    {
      "author": "nik9000",
      "text": "words words words"
    }
  ]
}

POST test/_search
{
  "query": {
    "nested": {
      "path": "comments",
      "query": {
        "match": {"comments.text" : "words"}
      },
      "inner_hits": {
        "_source" : false,
        "docvalue_fields" : [
          "comments.text.keyword"
        ]
      }
    }
  }
}
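
With the source disabled, values are read from each inner hit's fields entry rather than from _source. The sketch below collects those values. The helper name inner_hit_field_values is hypothetical, and the sample hit is an assumed response shape for the request above (doc value fields returned as arrays under fields), not output copied from a real cluster.

```python
from typing import Any, Dict, List


def inner_hit_field_values(hit: Dict[str, Any], name: str, field: str) -> List[Any]:
    """Collect the values of one doc value field across a search hit's inner hits."""
    values: List[Any] = []
    inner_hits = hit.get("inner_hits", {}).get(name, {}).get("hits", {}).get("hits", [])
    for inner in inner_hits:
        # With "_source": false there is no "_source" key; values arrive under "fields".
        values.extend(inner.get("fields", {}).get(field, []))
    return values


# Assumed shape of one search hit for the request above (illustrative only).
hit = {
    "_index": "test",
    "_id": "1",
    "inner_hits": {
        "comments": {
            "hits": {
                "hits": [
                    {
                        "_id": "1",
                        "_nested": {"field": "comments", "offset": 1},
                        "fields": {"comments.text.keyword": ["words words words"]},
                    }
                ]
            }
        }
    },
}

texts = inner_hit_field_values(hit, "comments", "comments.text.keyword")
print(texts)
```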

Hierarchical levels of nested object fields and inner hits

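If a mapping has multiple levels of hierarchical nested object fields, each level can be accessed via a dot-notated path. For example, if there is a comments nested field that contains a votes nested field, and votes should be returned directly with the root hits, the following path can be defined: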

Python:

resp = client.indices.create(
    index="test",
    mappings={
        "properties": {
            "comments": {
                "type": "nested",
                "properties": {
                    "votes": {
                        "type": "nested"
                    }
                }
            }
        }
    },
)
print(resp)

resp1 = client.index(
    index="test",
    id="1",
    refresh=True,
    document={
        "title": "Test title",
        "comments": [
            {
                "author": "kimchy",
                "text": "comment text",
                "votes": []
            },
            {
                "author": "nik9000",
                "text": "words words words",
                "votes": [
                    {
                        "value": 1,
                        "voter": "kimchy"
                    },
                    {
                        "value": -1,
                        "voter": "other"
                    }
                ]
            }
        ]
    },
)
print(resp1)

resp2 = client.search(
    index="test",
    query={
        "nested": {
            "path": "comments.votes",
            "query": {
                "match": {
                    "comments.votes.voter": "kimchy"
                }
            },
            "inner_hits": {}
        }
    },
)
print(resp2)

Ruby:

response = client.indices.create(
  index: 'test',
  body: {
    mappings: {
      properties: {
        comments: {
          type: 'nested',
          properties: {
            votes: {
              type: 'nested'
            }
          }
        }
      }
    }
  }
)
puts response

response = client.index(
  index: 'test',
  id: 1,
  refresh: true,
  body: {
    title: 'Test title',
    comments: [
      {
        author: 'kimchy',
        text: 'comment text',
        votes: []
      },
      {
        author: 'nik9000',
        text: 'words words words',
        votes: [
          {
            value: 1,
            voter: 'kimchy'
          },
          {
            value: -1,
            voter: 'other'
          }
        ]
      }
    ]
  }
)
puts response

response = client.search(
  index: 'test',
  body: {
    query: {
      nested: {
        path: 'comments.votes',
        query: {
          match: {
            'comments.votes.voter' => 'kimchy'
          }
        },
        inner_hits: {}
      }
    }
  }
)
puts response

JavaScript:

const response = await client.indices.create({
  index: "test",
  mappings: {
    properties: {
      comments: {
        type: "nested",
        properties: {
          votes: {
            type: "nested",
          },
        },
      },
    },
  },
});
console.log(response);

const response1 = await client.index({
  index: "test",
  id: 1,
  refresh: "true",
  document: {
    title: "Test title",
    comments: [
      {
        author: "kimchy",
        text: "comment text",
        votes: [],
      },
      {
        author: "nik9000",
        text: "words words words",
        votes: [
          {
            value: 1,
            voter: "kimchy",
          },
          {
            value: -1,
            voter: "other",
          },
        ],
      },
    ],
  },
});
console.log(response1);

const response2 = await client.search({
  index: "test",
  query: {
    nested: {
      path: "comments.votes",
      query: {
        match: {
          "comments.votes.voter": "kimchy",
        },
      },
      inner_hits: {},
    },
  },
});
console.log(response2);

Console:

PUT test
{
  "mappings": {
    "properties": {
      "comments": {
        "type": "nested",
        "properties": {
          "votes": {
            "type": "nested"
          }
        }
      }
    }
  }
}

PUT test/_doc/1?refresh
{
  "title": "Test title",
  "comments": [
    {
      "author": "kimchy",
      "text": "comment text",
      "votes": []
    },
    {
      "author": "nik9000",
      "text": "words words words",
      "votes": [
        {"value": 1 , "voter": "kimchy"},
        {"value": -1, "voter": "other"}
      ]
    }
  ]
}

POST test/_search
{
  "query": {
    "nested": {
      "path": "comments.votes",
      "query": {
        "match": {
          "comments.votes.voter": "kimchy"
        }
      },
      "inner_hits": {}
    }
  }
}

The response would look like this:

{
  ...,
  "hits": {
    "total" : {
        "value": 1,
        "relation": "eq"
    },
    "max_score": 0.6931471,
    "hits": [
      {
        "_index": "test",
        "_id": "1",
        "_score": 0.6931471,
        "_source": ...,
        "inner_hits": {
          "comments.votes": { 
            "hits": {
              "total" : {
                  "value": 1,
                  "relation": "eq"
              },
              "max_score": 0.6931471,
              "hits": [
                {
                  "_index": "test",
                  "_id": "1",
                  "_nested": {
                    "field": "comments",
                    "offset": 1,
                    "_nested": {
                      "field": "votes",
                      "offset": 0
                    }
                  },
                  "_score": 0.6931471,
                  "_source": {
                    "value": 1,
                    "voter": "kimchy"
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

This indirect referencing is only supported for nested inner hits.
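
When logging or debugging multi-level inner hits, it can help to render the chained _nested metadata as a single readable path. The following is a small sketch. The helper name nested_path and the bracketed output format are assumptions made for illustration, not an Elasticsearch convention.

```python
from typing import Any, Dict, List, Optional


def nested_path(nested: Optional[Dict[str, Any]]) -> str:
    """Render a _nested chain as a path such as 'comments[1].votes[0]'."""
    parts: List[str] = []
    node = nested
    while node is not None:
        parts.append(f"{node['field']}[{node['offset']}]")
        # Each level may carry a further "_nested" entry for the next level down.
        node = node.get("_nested")
    return ".".join(parts)


# The _nested metadata from the response above.
nested = {
    "field": "comments",
    "offset": 1,
    "_nested": {"field": "votes", "offset": 0},
}
print(nested_path(nested))  # comments[1].votes[0]
```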

Parent/child inner hits

The parent/child inner_hits can be used to include the matching parent or child documents:

Python:

resp = client.indices.create(
    index="test",
    mappings={
        "properties": {
            "my_join_field": {
                "type": "join",
                "relations": {
                    "my_parent": "my_child"
                }
            }
        }
    },
)
print(resp)

resp1 = client.index(
    index="test",
    id="1",
    refresh=True,
    document={
        "number": 1,
        "my_join_field": "my_parent"
    },
)
print(resp1)

resp2 = client.index(
    index="test",
    id="2",
    routing="1",
    refresh=True,
    document={
        "number": 1,
        "my_join_field": {
            "name": "my_child",
            "parent": "1"
        }
    },
)
print(resp2)

resp3 = client.search(
    index="test",
    query={
        "has_child": {
            "type": "my_child",
            "query": {
                "match": {
                    "number": 1
                }
            },
            "inner_hits": {}
        }
    },
)
print(resp3)

Ruby:

response = client.indices.create(
  index: 'test',
  body: {
    mappings: {
      properties: {
        my_join_field: {
          type: 'join',
          relations: {
            my_parent: 'my_child'
          }
        }
      }
    }
  }
)
puts response

response = client.index(
  index: 'test',
  id: 1,
  refresh: true,
  body: {
    number: 1,
    my_join_field: 'my_parent'
  }
)
puts response

response = client.index(
  index: 'test',
  id: 2,
  routing: 1,
  refresh: true,
  body: {
    number: 1,
    my_join_field: {
      name: 'my_child',
      parent: '1'
    }
  }
)
puts response

response = client.search(
  index: 'test',
  body: {
    query: {
      has_child: {
        type: 'my_child',
        query: {
          match: {
            number: 1
          }
        },
        inner_hits: {}
      }
    }
  }
)
puts response

JavaScript:

const response = await client.indices.create({
  index: "test",
  mappings: {
    properties: {
      my_join_field: {
        type: "join",
        relations: {
          my_parent: "my_child",
        },
      },
    },
  },
});
console.log(response);

const response1 = await client.index({
  index: "test",
  id: 1,
  refresh: "true",
  document: {
    number: 1,
    my_join_field: "my_parent",
  },
});
console.log(response1);

const response2 = await client.index({
  index: "test",
  id: 2,
  routing: 1,
  refresh: "true",
  document: {
    number: 1,
    my_join_field: {
      name: "my_child",
      parent: "1",
    },
  },
});
console.log(response2);

const response3 = await client.search({
  index: "test",
  query: {
    has_child: {
      type: "my_child",
      query: {
        match: {
          number: 1,
        },
      },
      inner_hits: {},
    },
  },
});
console.log(response3);

Console:

PUT test
{
  "mappings": {
    "properties": {
      "my_join_field": {
        "type": "join",
        "relations": {
          "my_parent": "my_child"
        }
      }
    }
  }
}

PUT test/_doc/1?refresh
{
  "number": 1,
  "my_join_field": "my_parent"
}

PUT test/_doc/2?routing=1&refresh
{
  "number": 1,
  "my_join_field": {
    "name": "my_child",
    "parent": "1"
  }
}

POST test/_search
{
  "query": {
    "has_child": {
      "type": "my_child",
      "query": {
        "match": {
          "number": 1
        }
      },
      "inner_hits": {}    
    }
  }
}

The "inner_hits": {} entry is an inner hit definition like the one in the nested example.

An example of a response snippet that could be generated from the above search request:

{
  ...,
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "test",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "number": 1,
          "my_join_field": "my_parent"
        },
        "inner_hits": {
          "my_child": {
            "hits": {
              "total": {
                "value": 1,
                "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "test",
                  "_id": "2",
                  "_score": 1.0,
                  "_routing": "1",
                  "_source": {
                    "number": 1,
                    "my_join_field": {
                      "name": "my_child",
                      "parent": "1"
                    }
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
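
As a sketch of how such a response might be consumed, the following helper maps each parent hit to the IDs of the child documents that caused it to match. The function name children_by_parent is hypothetical, and the sample response is a trimmed copy of the snippet above. The inner hits are looked up under the child type because that is the default name for has_child.

```python
from typing import Any, Dict, List


def children_by_parent(response: Dict[str, Any], child_type: str) -> Dict[str, List[str]]:
    """Map each parent hit's _id to the _ids of its matching children."""
    result: Dict[str, List[str]] = {}
    for hit in response["hits"]["hits"]:
        # Fall back to an empty group when the hit carries no inner hits of this name.
        group = hit.get("inner_hits", {}).get(child_type, {"hits": {"hits": []}})
        result[hit["_id"]] = [child["_id"] for child in group["hits"]["hits"]]
    return result


# Trimmed copy of the response snippet above.
response = {
    "hits": {
        "hits": [
            {
                "_index": "test",
                "_id": "1",
                "inner_hits": {
                    "my_child": {
                        "hits": {
                            "total": {"value": 1, "relation": "eq"},
                            "hits": [{"_index": "test", "_id": "2", "_routing": "1"}],
                        }
                    }
                },
            }
        ]
    }
}

print(children_by_parent(response, "my_child"))
```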