Retrieve inner hits

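The parent-join and nested features allow documents to be returned that have matches in a different scope. In the parent/child case, parent documents are returned based on matches in child documents, or child documents are returned based on matches in parent documents. In the nested case, documents are returned based on matches in nested inner objects.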

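In both cases, the actual matches in the different scopes that caused a document to be returned are hidden. It is often very useful to know which inner nested objects (in the nested case) or which child/parent documents (in the parent/child case) caused a given document to be returned. The inner hits feature can be used for this. For every search hit in the search response, it returns the additional nested hits that caused that search hit to match in a different scope.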

Inner hits can be used by adding an inner_hits definition to a nested, has_child, or has_parent query and filter. The structure looks like this:

"<query>" : {
    "inner_hits" : {
        <inner_hits_options>
    }
}

If inner_hits is defined on a query that supports it, each search hit will contain an inner_hits JSON object with the following structure:

"hits": [
     {
        "_index": ...,
        "_type": ...,
        "_id": ...,
        "inner_hits": {
           "<inner_hits_name>": {
              "hits": {
                 "total": ...,
                 "hits": [
                    {
                       "_id": ...,
                       ...
                    },
                    ...
                 ]
              }
           }
        },
        ...
     },
     ...
]
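
As a minimal sketch of how a client might consume this structure, the helper below groups the inner hits of each top-level hit by inner hits name. The method name inner_hits_by_id and the sample data are illustrative assumptions, not part of any client library; the response is assumed to be a Hash with string keys, as JSON.parse would return.

```ruby
# Collect inner hits per top-level hit from a parsed search response.
# `response` is assumed to be a Hash with string keys (e.g. from JSON.parse).
def inner_hits_by_id(response)
  hits = response.dig('hits', 'hits') || []
  hits.each_with_object({}) do |hit, result|
    groups = hit['inner_hits'] || {}
    # Map each inner hits name to the array of its hit objects.
    result[hit['_id']] = groups.transform_values do |group|
      group.dig('hits', 'hits') || []
    end
  end
end

# Hand-written sample that mirrors the structure above (not real server output).
sample = {
  'hits' => {
    'hits' => [
      {
        '_index' => 'test',
        '_id' => '1',
        'inner_hits' => {
          'comments' => {
            'hits' => {
              'total' => { 'value' => 1, 'relation' => 'eq' },
              'hits' => [{ '_id' => '1', '_source' => { 'author' => 'nik9000' } }]
            }
          }
        }
      }
    ]
  }
}

grouped = inner_hits_by_id(sample)
puts grouped['1']['comments'].length
```

Hits without an inner_hits object map to an empty Hash, so callers do not need a separate nil check.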

Options

Inner hits support the following options:

from

The offset of the first hit to fetch for each inner_hits in the returned regular search hits.

size

The maximum number of hits to return per inner_hits. By default, the top three matching hits are returned.

sort

How the inner hits should be sorted per inner_hits. By default, the hits are sorted by score.

name

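The name to be used for the particular inner hit definition in the response. Useful when multiple inner hits have been defined in a single search request. The default depends on the query in which the inner hit is defined: for the has_child query and filter it is the child type, for the has_parent query and filter it is the parent type, and for the nested query and filter it is the nested path.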
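
The four options above can be combined in a single inner_hits definition. The following is a sketch only: the builder method nested_query_with_inner_hits, its keyword arguments, and the name matching_comments are hypothetical, and the sort clause assumes that comments.number is a sortable field in the index.

```ruby
require 'json'

# Build a nested query whose inner hits use the name, from, size, and sort options.
# Options left as nil are omitted, so the server-side defaults apply.
def nested_query_with_inner_hits(path:, query:, name: nil, from: nil, size: nil, sort: nil)
  inner_hits = { name: name, from: from, size: size, sort: sort }.compact
  { query: { nested: { path: path, query: query, inner_hits: inner_hits } } }
end

body = nested_query_with_inner_hits(
  path: 'comments',
  query: { match: { 'comments.number' => 2 } },
  name: 'matching_comments',                          # custom key in the response
  size: 5,                                            # up to five inner hits per search hit
  sort: [{ 'comments.number' => { order: 'desc' } }]  # assumes a sortable field
)
puts JSON.pretty_generate(body)
```

The resulting Hash can be passed as the body of a search request. With no options given, inner_hits is an empty object.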

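Inner hits also support per-document features, including highlighting, explain, source filtering, script fields, and doc value fields.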

Nested inner hits

The nested inner_hits can be used to include nested inner objects as inner hits in a search hit.

response = client.indices.create(
  index: 'test',
  body: {
    mappings: {
      properties: {
        comments: {
          type: 'nested'
        }
      }
    }
  }
)
puts response

response = client.index(
  index: 'test',
  id: 1,
  refresh: true,
  body: {
    title: 'Test title',
    comments: [
      {
        author: 'kimchy',
        number: 1
      },
      {
        author: 'nik9000',
        number: 2
      }
    ]
  }
)
puts response

response = client.search(
  index: 'test',
  body: {
    query: {
      nested: {
        path: 'comments',
        query: {
          match: {
            'comments.number' => 2
          }
        },
        inner_hits: {}
      }
    }
  }
)
puts response

PUT test
{
  "mappings": {
    "properties": {
      "comments": {
        "type": "nested"
      }
    }
  }
}

PUT test/_doc/1?refresh
{
  "title": "Test title",
  "comments": [
    {
      "author": "kimchy",
      "number": 1
    },
    {
      "author": "nik9000",
      "number": 2
    }
  ]
}

POST test/_search
{
  "query": {
    "nested": {
      "path": "comments",
      "query": {
        "match": {"comments.number" : 2}
      },
      "inner_hits": {} 
    }
  }
}

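The inner_hits entry in the request above is the inner hit definition in the nested query. No other options need to be defined.

An example of a response snippet that could be generated from the above search request: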

{
  ...,
  "hits": {
    "total" : {
        "value": 1,
        "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "test",
        "_id": "1",
        "_score": 1.0,
        "_source": ...,
        "inner_hits": {
          "comments": { 
            "hits": {
              "total" : {
                  "value": 1,
                  "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "test",
                  "_id": "1",
                  "_nested": {
                    "field": "comments",
                    "offset": 1
                  },
                  "_score": 1.0,
                  "_source": {
                    "author": "nik9000",
                    "number": 2
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

The comments key is the name used in the inner hit definition in the search request. A custom key can be set via the name option.

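The _nested metadata is crucial in the above example, because it identifies which inner nested object this inner hit came from. The field defines the object array field the nested hit comes from, and the offset is relative to its position in the _source. Because of sorting and scoring, the actual position of the hit objects in inner_hits is usually different from the position at which the nested inner object was defined.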

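By default, the _source is also returned for the hit objects in inner_hits, but this can be changed. The source filtering feature can be used to return only part of the source or to disable it. If stored fields are defined at the nested level, these can also be returned via the fields feature.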

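An important default is that the _source returned in hits inside inner_hits is relative to the _nested metadata. So in the above example only the comment part is returned per nested hit, not the entire source of the top-level document that contains the comment.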
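
To make the relationship between _nested and the root _source concrete, here is a small sketch that resolves a single-level _nested entry back to the object it points at. The helper name nested_object_for is hypothetical, and the data is copied from the example above rather than taken from a live response.

```ruby
# Look up the nested object that an inner hit refers to, using its _nested metadata.
# Handles a single nesting level only.
def nested_object_for(root_source, inner_hit)
  nested = inner_hit.fetch('_nested')
  root_source.fetch(nested.fetch('field')).fetch(nested.fetch('offset'))
end

# Root document source and inner hit, as in the example above.
root_source = {
  'title' => 'Test title',
  'comments' => [
    { 'author' => 'kimchy', 'number' => 1 },
    { 'author' => 'nik9000', 'number' => 2 }
  ]
}
inner_hit = {
  '_nested' => { 'field' => 'comments', 'offset' => 1 },
  '_source' => { 'author' => 'nik9000', 'number' => 2 }
}

resolved = nested_object_for(root_source, inner_hit)
# The inner hit's _source is the same object found at the offset in the root source.
puts resolved == inner_hit['_source']
```

Using fetch instead of [] makes a missing field or an out-of-range offset raise immediately rather than silently returning nil.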

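Nested inner hits and _source

Nested documents do not have a _source field of their own, because the entire source of a document is stored with the root document under its _source field. To include the source for just the nested document, the source of the root document is parsed and only the relevant part for the nested document is included as the source in the inner hit. Doing this for every matching nested document affects the time it takes to execute the entire search request, especially when size and the inner hits' size are set higher than the default. To avoid the relatively expensive source extraction for nested inner hits, you can disable including the source and rely solely on doc values fields. Like this: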

response = client.indices.create(
  index: 'test',
  body: {
    mappings: {
      properties: {
        comments: {
          type: 'nested'
        }
      }
    }
  }
)
puts response

response = client.index(
  index: 'test',
  id: 1,
  refresh: true,
  body: {
    title: 'Test title',
    comments: [
      {
        author: 'kimchy',
        text: 'comment text'
      },
      {
        author: 'nik9000',
        text: 'words words words'
      }
    ]
  }
)
puts response

response = client.search(
  index: 'test',
  body: {
    query: {
      nested: {
        path: 'comments',
        query: {
          match: {
            'comments.text' => 'words'
          }
        },
        inner_hits: {
          _source: false,
          docvalue_fields: [
            'comments.text.keyword'
          ]
        }
      }
    }
  }
)
puts response

PUT test
{
  "mappings": {
    "properties": {
      "comments": {
        "type": "nested"
      }
    }
  }
}

PUT test/_doc/1?refresh
{
  "title": "Test title",
  "comments": [
    {
      "author": "kimchy",
      "text": "comment text"
    },
    {
      "author": "nik9000",
      "text": "words words words"
    }
  ]
}

POST test/_search
{
  "query": {
    "nested": {
      "path": "comments",
      "query": {
        "match": {"comments.text" : "words"}
      },
      "inner_hits": {
        "_source" : false,
        "docvalue_fields" : [
          "comments.text.keyword"
        ]
      }
    }
  }
}
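
With _source disabled, the requested values have to be read from somewhere other than _source. The sketch below assumes that each inner hit carries them under a fields key as arrays of values, which is how doc value fields appear on regular search hits; the helper name inner_hit_field_values and the sample hit are hypothetical.

```ruby
# Gather the values of one doc value field across all inner hits of a search hit.
# Assumes each inner hit exposes requested doc values under a 'fields' key.
def inner_hit_field_values(hit, inner_hits_name, field)
  inner = hit.dig('inner_hits', inner_hits_name, 'hits', 'hits') || []
  inner.flat_map { |h| h.dig('fields', field) || [] }
end

# Hand-written sample shaped like the assumed response (not real server output).
hit = {
  '_id' => '1',
  'inner_hits' => {
    'comments' => {
      'hits' => {
        'hits' => [
          {
            '_nested' => { 'field' => 'comments', 'offset' => 1 },
            'fields' => { 'comments.text.keyword' => ['words words words'] }
          }
        ]
      }
    }
  }
}

puts inner_hit_field_values(hit, 'comments', 'comments.text.keyword').inspect
```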

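Hierarchical levels of nested object fields and inner hits

If a mapping has multiple levels of hierarchical nested object fields, each level can be accessed via a dot-notated path. For example, if there is a comments nested field that contains a votes nested field, and the votes should be returned directly with the root hits, then the following path can be defined: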

response = client.indices.create(
  index: 'test',
  body: {
    mappings: {
      properties: {
        comments: {
          type: 'nested',
          properties: {
            votes: {
              type: 'nested'
            }
          }
        }
      }
    }
  }
)
puts response

response = client.index(
  index: 'test',
  id: 1,
  refresh: true,
  body: {
    title: 'Test title',
    comments: [
      {
        author: 'kimchy',
        text: 'comment text',
        votes: []
      },
      {
        author: 'nik9000',
        text: 'words words words',
        votes: [
          {
            value: 1,
            voter: 'kimchy'
          },
          {
            value: -1,
            voter: 'other'
          }
        ]
      }
    ]
  }
)
puts response

response = client.search(
  index: 'test',
  body: {
    query: {
      nested: {
        path: 'comments.votes',
        query: {
          match: {
            'comments.votes.voter' => 'kimchy'
          }
        },
        inner_hits: {}
      }
    }
  }
)
puts response

PUT test
{
  "mappings": {
    "properties": {
      "comments": {
        "type": "nested",
        "properties": {
          "votes": {
            "type": "nested"
          }
        }
      }
    }
  }
}

PUT test/_doc/1?refresh
{
  "title": "Test title",
  "comments": [
    {
      "author": "kimchy",
      "text": "comment text",
      "votes": []
    },
    {
      "author": "nik9000",
      "text": "words words words",
      "votes": [
        {"value": 1 , "voter": "kimchy"},
        {"value": -1, "voter": "other"}
      ]
    }
  ]
}

POST test/_search
{
  "query": {
    "nested": {
      "path": "comments.votes",
      "query": {
        "match": {
          "comments.votes.voter": "kimchy"
        }
      },
      "inner_hits": {}
    }
  }
}

The response would look like this:

{
  ...,
  "hits": {
    "total" : {
        "value": 1,
        "relation": "eq"
    },
    "max_score": 0.6931471,
    "hits": [
      {
        "_index": "test",
        "_id": "1",
        "_score": 0.6931471,
        "_source": ...,
        "inner_hits": {
          "comments.votes": { 
            "hits": {
              "total" : {
                  "value": 1,
                  "relation": "eq"
              },
              "max_score": 0.6931471,
              "hits": [
                {
                  "_index": "test",
                  "_id": "1",
                  "_nested": {
                    "field": "comments",
                    "offset": 1,
                    "_nested": {
                      "field": "votes",
                      "offset": 0
                    }
                  },
                  "_score": 0.6931471,
                  "_source": {
                    "value": 1,
                    "voter": "kimchy"
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

This indirect referencing is only supported for nested inner hits.
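
The chained _nested metadata in the response above can be followed level by level to locate the matching object in the root _source. The following is a sketch under the assumption that each level carries field, offset, and an optional inner _nested entry, as shown in the response; the helper name resolve_nested is hypothetical.

```ruby
# Follow a (possibly multi-level) _nested chain from the root source down to
# the nested object. Returns a readable path and the resolved object.
def resolve_nested(root_source, nested)
  current = root_source
  path = []
  while nested
    field = nested.fetch('field')
    offset = nested.fetch('offset')
    current = current.fetch(field).fetch(offset)
    path << "#{field}[#{offset}]"
    nested = nested['_nested'] # nil at the innermost level ends the loop
  end
  [path.join('.'), current]
end

# Root source and _nested metadata, copied from the example above.
root_source = {
  'title' => 'Test title',
  'comments' => [
    { 'author' => 'kimchy', 'text' => 'comment text', 'votes' => [] },
    { 'author' => 'nik9000', 'text' => 'words words words',
      'votes' => [{ 'value' => 1, 'voter' => 'kimchy' }, { 'value' => -1, 'voter' => 'other' }] }
  ]
}
nested = { 'field' => 'comments', 'offset' => 1, '_nested' => { 'field' => 'votes', 'offset' => 0 } }

path, object = resolve_nested(root_source, nested)
puts path
puts object.inspect
```

Note that the field at each inner level is relative to its parent (votes, not comments.votes), which is why the lookup descends one object at a time.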

Parent/child inner hits

The parent/child inner_hits can be used to include the parent or child:

response = client.indices.create(
  index: 'test',
  body: {
    mappings: {
      properties: {
        my_join_field: {
          type: 'join',
          relations: {
            my_parent: 'my_child'
          }
        }
      }
    }
  }
)
puts response

response = client.index(
  index: 'test',
  id: 1,
  refresh: true,
  body: {
    number: 1,
    my_join_field: 'my_parent'
  }
)
puts response

response = client.index(
  index: 'test',
  id: 2,
  routing: 1,
  refresh: true,
  body: {
    number: 1,
    my_join_field: {
      name: 'my_child',
      parent: '1'
    }
  }
)
puts response

response = client.search(
  index: 'test',
  body: {
    query: {
      has_child: {
        type: 'my_child',
        query: {
          match: {
            number: 1
          }
        },
        inner_hits: {}
      }
    }
  }
)
puts response

PUT test
{
  "mappings": {
    "properties": {
      "my_join_field": {
        "type": "join",
        "relations": {
          "my_parent": "my_child"
        }
      }
    }
  }
}

PUT test/_doc/1?refresh
{
  "number": 1,
  "my_join_field": "my_parent"
}

PUT test/_doc/2?routing=1&refresh
{
  "number": 1,
  "my_join_field": {
    "name": "my_child",
    "parent": "1"
  }
}

POST test/_search
{
  "query": {
    "has_child": {
      "type": "my_child",
      "query": {
        "match": {
          "number": 1
        }
      },
      "inner_hits": {}    
    }
  }
}

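The inner_hits entry in the request above is the inner hit definition, as in the nested example.

An example of a response snippet that could be generated from the above search request: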

{
  ...,
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "test",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "number": 1,
          "my_join_field": "my_parent"
        },
        "inner_hits": {
          "my_child": {
            "hits": {
              "total": {
                "value": 1,
                "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "test",
                  "_id": "2",
                  "_score": 1.0,
                  "_routing": "1",
                  "_source": {
                    "number": 1,
                    "my_join_field": {
                      "name": "my_child",
                      "parent": "1"
                    }
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
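
The example above goes from parent to child with has_child. For the opposite direction, a has_parent query can carry an inner_hits definition in the same way. The sketch below only builds the request body; the helper name has_parent_with_inner_hits is hypothetical, and the my_parent relation and number field are reused from the mapping above.

```ruby
require 'json'

# Build a has_parent query that returns child documents and includes the
# matching parent as inner hits. An empty inner_hits Hash uses the defaults.
def has_parent_with_inner_hits(parent_type:, query:, inner_hits: {})
  {
    query: {
      has_parent: {
        parent_type: parent_type,
        query: query,
        inner_hits: inner_hits
      }
    }
  }
end

body = has_parent_with_inner_hits(
  parent_type: 'my_parent',
  query: { match: { number: 1 } }
)
puts JSON.pretty_generate(body)
```

Per the name option described earlier, the inner hits for this query would be keyed by the parent type unless a custom name is set.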