Reverse nested aggregation

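A special single-bucket aggregation that enables aggregating on parent documents from nested documents. In effect, this aggregation can break out of the nested block structure and link to other nested structures or to the root document, which makes it possible to nest, inside a nested aggregation, other aggregations that are not part of the nested object.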

The reverse_nested aggregation must be defined inside a nested aggregation.

Options

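  • path - Defines which nested object field to join back to. The default is empty, which means it joins back to the root/main document level. The path cannot reference a nested object field that falls outside the nested structure of the nested aggregation that contains the reverse_nested aggregation.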

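For example, suppose we have an index for a ticket system with issues and comments. The comments are inlined into the issue documents as nested documents. The mapping could look like this: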

resp = client.indices.create(
    index="issues",
    mappings={
        "properties": {
            "tags": {
                "type": "keyword"
            },
            "comments": {
                "type": "nested",
                "properties": {
                    "username": {
                        "type": "keyword"
                    },
                    "comment": {
                        "type": "text"
                    }
                }
            }
        }
    },
)
print(resp)
response = client.indices.create(
  index: 'issues',
  body: {
    mappings: {
      properties: {
        tags: {
          type: 'keyword'
        },
        comments: {
          type: 'nested',
          properties: {
            username: {
              type: 'keyword'
            },
            comment: {
              type: 'text'
            }
          }
        }
      }
    }
  }
)
puts response
const response = await client.indices.create({
  index: "issues",
  mappings: {
    properties: {
      tags: {
        type: "keyword",
      },
      comments: {
        type: "nested",
        properties: {
          username: {
            type: "keyword",
          },
          comment: {
            type: "text",
          },
        },
      },
    },
  },
});
console.log(response);
PUT /issues
{
  "mappings": {
    "properties": {
      "tags": { "type": "keyword" },
      "comments": {                            
        "type": "nested",
        "properties": {
          "username": { "type": "keyword" },
          "comment": { "type": "text" }
        }
      }
    }
  }
}

comments is an array that holds nested documents under the issue object.
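To make the shape of such a document concrete, here is a minimal sketch of one issue that fits the mapping above. The field values, the document id, and the commented-out indexing call are illustrative assumptions, not part of the original example.

```python
# Hypothetical sample issue; all values are made up for illustration.
issue_doc = {
    "tags": ["tag_1"],
    # Each element of "comments" is stored as a separate nested document.
    "comments": [
        {"username": "username_1", "comment": "It happens on my machine too."},
        {"username": "username_2", "comment": "A fix is on the way."},
    ],
}

# With a connected client (assumed, not created here), the document could be
# indexed along these lines:
# client.index(index="issues", id="1", document=issue_doc)
```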

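The following aggregation returns the usernames of the top commenters and, for each top commenter, the top tags of the issues they commented on: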

resp = client.search(
    index="issues",
    query={
        "match_all": {}
    },
    aggs={
        "comments": {
            "nested": {
                "path": "comments"
            },
            "aggs": {
                "top_usernames": {
                    "terms": {
                        "field": "comments.username"
                    },
                    "aggs": {
                        "comment_to_issue": {
                            "reverse_nested": {},
                            "aggs": {
                                "top_tags_per_comment": {
                                    "terms": {
                                        "field": "tags"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    },
)
print(resp)
response = client.search(
  index: 'issues',
  body: {
    query: {
      match_all: {}
    },
    aggregations: {
      comments: {
        nested: {
          path: 'comments'
        },
        aggregations: {
          top_usernames: {
            terms: {
              field: 'comments.username'
            },
            aggregations: {
              comment_to_issue: {
                reverse_nested: {},
                aggregations: {
                  top_tags_per_comment: {
                    terms: {
                      field: 'tags'
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
)
puts response
const response = await client.search({
  index: "issues",
  query: {
    match_all: {},
  },
  aggs: {
    comments: {
      nested: {
        path: "comments",
      },
      aggs: {
        top_usernames: {
          terms: {
            field: "comments.username",
          },
          aggs: {
            comment_to_issue: {
              reverse_nested: {},
              aggs: {
                top_tags_per_comment: {
                  terms: {
                    field: "tags",
                  },
                },
              },
            },
          },
        },
      },
    },
  },
});
console.log(response);
GET /issues/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "comments": {
      "nested": {
        "path": "comments"
      },
      "aggs": {
        "top_usernames": {
          "terms": {
            "field": "comments.username"
          },
          "aggs": {
            "comment_to_issue": {
              "reverse_nested": {}, 
              "aggs": {
                "top_tags_per_comment": {
                  "terms": {
                    "field": "tags"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

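As shown above, the reverse_nested aggregation is placed inside a nested aggregation, because this is the only place in the DSL where a reverse_nested aggregation can be used. Its sole purpose is to join back to a parent document higher up in the nested structure.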

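Because no path is defined, the reverse_nested aggregation joins back to the root/main document level. If the mapping defines multiple layered nested object types, the path option lets the reverse_nested aggregation join back to a different level.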
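As a rough sketch of the path option, assume a hypothetical variant of the mapping in which each comment also carries a nested replies field with its own username keyword field (comments.replies is not part of the mapping shown earlier). The aggregation body below starts at the reply level and uses reverse_nested with path set to comments to join back to the comment level rather than to the root document. All aggregation names are invented for illustration.

```python
# Assumes a hypothetical two-level mapping: issue -> comments -> replies,
# where both "comments" and "comments.replies" are of type "nested".
aggs = {
    "replies": {
        "nested": {"path": "comments.replies"},
        "aggs": {
            "top_repliers": {
                "terms": {"field": "comments.replies.username"},
                "aggs": {
                    "reply_to_comment": {
                        # Join back to the comment level, not the root issue.
                        "reverse_nested": {"path": "comments"},
                        "aggs": {
                            "commenters_replied_to": {
                                "terms": {"field": "comments.username"}
                            }
                        },
                    }
                },
            }
        },
    }
}

# With a connected client (assumed), this could be sent as:
# resp = client.search(index="issues", aggs=aggs)
```

Omitting path in the reverse_nested object would instead join back to the root issue, as in the example above.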

Possible response snippet:

{
  "aggregations": {
    "comments": {
      "doc_count": 1,
      "top_usernames": {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets": [
          {
            "key": "username_1",
            "doc_count": 1,
            "comment_to_issue": {
              "doc_count": 1,
              "top_tags_per_comment": {
                "doc_count_error_upper_bound" : 0,
                "sum_other_doc_count" : 0,
                "buckets": [
                  {
                    "key": "tag_1",
                    "doc_count": 1
                  }
                  ...
                ]
              }
            }
          }
          ...
        ]
      }
    }
  }
}
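A response of this shape can be flattened into a per-user summary. The helper below is a minimal sketch that assumes the response can be indexed like a dict and uses the aggregation names from the request above; top_tags_by_username and sample_response are hypothetical names, and the sample is a trimmed copy of the snippet with the elided buckets left out.

```python
def top_tags_by_username(response):
    """Map each top commenter to a list of (tag, issue_count) pairs."""
    result = {}
    user_buckets = response["aggregations"]["comments"]["top_usernames"]["buckets"]
    for user_bucket in user_buckets:
        # comment_to_issue is the reverse_nested bucket; the tag counts inside
        # it refer to root issue documents, not to individual comments.
        tag_buckets = user_bucket["comment_to_issue"]["top_tags_per_comment"]["buckets"]
        result[user_bucket["key"]] = [(b["key"], b["doc_count"]) for b in tag_buckets]
    return result


# Trimmed copy of the response snippet above (elided buckets omitted).
sample_response = {
    "aggregations": {
        "comments": {
            "doc_count": 1,
            "top_usernames": {
                "buckets": [
                    {
                        "key": "username_1",
                        "doc_count": 1,
                        "comment_to_issue": {
                            "doc_count": 1,
                            "top_tags_per_comment": {
                                "buckets": [{"key": "tag_1", "doc_count": 1}]
                            },
                        },
                    }
                ]
            },
        }
    }
}

print(top_tags_by_username(sample_response))
# {'username_1': [('tag_1', 1)]}
```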