Geo-centroid aggregation

A metric aggregation that computes the weighted centroid from all coordinate values of a geo field.

Example:

response = client.indices.create(
  index: 'museums',
  body: {
    mappings: {
      properties: {
        location: {
          type: 'geo_point'
        }
      }
    }
  }
)
puts response

response = client.bulk(
  index: 'museums',
  refresh: true,
  body: [
    {
      index: {
        _id: 1
      }
    },
    {
      location: 'POINT (4.912350 52.374081)',
      city: 'Amsterdam',
      name: 'NEMO Science Museum'
    },
    {
      index: {
        _id: 2
      }
    },
    {
      location: 'POINT (4.901618 52.369219)',
      city: 'Amsterdam',
      name: 'Museum Het Rembrandthuis'
    },
    {
      index: {
        _id: 3
      }
    },
    {
      location: 'POINT (4.914722 52.371667)',
      city: 'Amsterdam',
      name: 'Nederlands Scheepvaartmuseum'
    },
    {
      index: {
        _id: 4
      }
    },
    {
      location: 'POINT (4.405200 51.222900)',
      city: 'Antwerp',
      name: 'Letterenhuis'
    },
    {
      index: {
        _id: 5
      }
    },
    {
      location: 'POINT (2.336389 48.861111)',
      city: 'Paris',
      name: 'Musée du Louvre'
    },
    {
      index: {
        _id: 6
      }
    },
    {
      location: 'POINT (2.327000 48.860000)',
      city: 'Paris',
      name: "Musée d'Orsay"
    }
  ]
)
puts response

response = client.search(
  index: 'museums',
  size: 0,
  body: {
    aggregations: {
      centroid: {
        geo_centroid: {
          field: 'location'
        }
      }
    }
  }
)
puts response

PUT /museums
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}

POST /museums/_bulk?refresh
{"index":{"_id":1}}
{"location": "POINT (4.912350 52.374081)", "city": "Amsterdam", "name": "NEMO Science Museum"}
{"index":{"_id":2}}
{"location": "POINT (4.901618 52.369219)", "city": "Amsterdam", "name": "Museum Het Rembrandthuis"}
{"index":{"_id":3}}
{"location": "POINT (4.914722 52.371667)", "city": "Amsterdam", "name": "Nederlands Scheepvaartmuseum"}
{"index":{"_id":4}}
{"location": "POINT (4.405200 51.222900)", "city": "Antwerp", "name": "Letterenhuis"}
{"index":{"_id":5}}
{"location": "POINT (2.336389 48.861111)", "city": "Paris", "name": "Musée du Louvre"}
{"index":{"_id":6}}
{"location": "POINT (2.327000 48.860000)", "city": "Paris", "name": "Musée d'Orsay"}

POST /museums/_search?size=0
{
  "aggs": {
    "centroid": {
      "geo_centroid": {
        "field": "location" 
      }
    }
  }
}

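The geo_centroid aggregation specifies the field to use for computing the centroid. (Note: the field must be of type geo_point; geo_shape fields are covered in a separate section below.)

The above aggregation demonstrates how to compute the centroid of the location field across all museum documents.

The response for the above aggregation: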

{
  ...
  "aggregations": {
    "centroid": {
      "location": {
        "lat": 51.00982965203002,
        "lon": 3.9662131341174245
      },
      "count": 6
    }
  }
}
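
For point data the result can be checked by hand. The following is a minimal Ruby sketch, assuming that the centroid of geo_point values is the equally weighted mean of their longitudes and latitudes (which is consistent with the sample response). The names `MUSEUM_POINTS` and `point_centroid` are illustrative only and are not part of the Elasticsearch client.

```ruby
# [lon, lat] pairs copied from the bulk request above (WKT order is "lon lat").
MUSEUM_POINTS = [
  [4.912350, 52.374081],
  [4.901618, 52.369219],
  [4.914722, 52.371667],
  [4.405200, 51.222900],
  [2.336389, 48.861111],
  [2.327000, 48.860000]
].freeze

# Equally weighted mean of all coordinates (illustrative helper).
def point_centroid(points)
  count = points.size
  lon = points.sum { |p| p[0] } / count
  lat = points.sum { |p| p[1] } / count
  { lat: lat, lon: lon, count: count }
end

centroid = point_centroid(MUSEUM_POINTS)
puts centroid
```

The computed values agree with the response to within about 1e-6 degrees. The small remaining difference is likely due to how coordinates are encoded in the index.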

The geo_centroid aggregation is more interesting when combined as a sub-aggregation of other bucket aggregations.

Example:

response = client.search(
  index: 'museums',
  size: 0,
  body: {
    aggregations: {
      cities: {
        terms: {
          field: 'city.keyword'
        },
        aggregations: {
          centroid: {
            geo_centroid: {
              field: 'location'
            }
          }
        }
      }
    }
  }
)
puts response

POST /museums/_search?size=0
{
  "aggs": {
    "cities": {
      "terms": { "field": "city.keyword" },
      "aggs": {
        "centroid": {
          "geo_centroid": { "field": "location" }
        }
      }
    }
  }
}

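The above example uses geo_centroid as a sub-aggregation of a terms bucket aggregation to find the central location of the museums in each city.

The response for the above aggregation: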

{
  ...
  "aggregations": {
    "cities": {
      "sum_other_doc_count": 0,
      "doc_count_error_upper_bound": 0,
      "buckets": [
        {
          "key": "Amsterdam",
          "doc_count": 3,
          "centroid": {
            "location": {
              "lat": 52.371655656024814,
              "lon": 4.909563297405839
            },
            "count": 3
          }
        },
        {
          "key": "Paris",
          "doc_count": 2,
          "centroid": {
            "location": {
              "lat": 48.86055548675358,
              "lon": 2.3316944623366
            },
            "count": 2
          }
        },
        {
          "key": "Antwerp",
          "doc_count": 1,
          "centroid": {
            "location": {
              "lat": 51.22289997059852,
              "lon": 4.40519998781383
            },
            "count": 1
          }
        }
      ]
    }
  }
}
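
The per-bucket centroids can be approximated client-side in the same way. This is a rough Ruby sketch, assuming that each bucket's centroid is the plain mean of the points in that bucket. `MUSEUMS` and `centroids_by_city` are hypothetical names introduced for illustration.

```ruby
# Sample documents from the bulk request above.
MUSEUMS = [
  { city: 'Amsterdam', lon: 4.912350, lat: 52.374081 },
  { city: 'Amsterdam', lon: 4.901618, lat: 52.369219 },
  { city: 'Amsterdam', lon: 4.914722, lat: 52.371667 },
  { city: 'Antwerp',   lon: 4.405200, lat: 51.222900 },
  { city: 'Paris',     lon: 2.336389, lat: 48.861111 },
  { city: 'Paris',     lon: 2.327000, lat: 48.860000 }
].freeze

# Group documents by city, then average the coordinates in each group.
def centroids_by_city(docs)
  buckets = docs.group_by { |doc| doc[:city] }.map do |city, members|
    n = members.size
    {
      key: city,
      doc_count: n,
      centroid: {
        lat: members.sum { |m| m[:lat] } / n,
        lon: members.sum { |m| m[:lon] } / n,
        count: n
      }
    }
  end
  # Mimic the default terms ordering: descending doc_count.
  buckets.sort_by { |bucket| -bucket[:doc_count] }
end

buckets = centroids_by_city(MUSEUMS)
buckets.each { |bucket| puts bucket }
```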

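Geo-centroid aggregation on geo_shape fields

The centroid metric for geo shapes is more nuanced than the one for points. The centroid of a specific aggregation bucket containing shapes is the centroid of the highest-dimensionality shape type in the bucket. For example, if a bucket contains shapes consisting of polygons and lines, the lines do not contribute to the centroid metric. The centroid is calculated differently for each shape type. Envelopes, and circles ingested via the Circle processor, are treated as polygons.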

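Geometry type | Centroid calculation
[Multi]Point | Equally weighted average of all the coordinates
[Multi]LineString | Weighted average of the centroids of all segments, where the weight of each segment is its length in degrees
[Multi]Polygon | Weighted average of the centroids of all triangles of the polygon, where the triangles are formed by every two consecutive vertices and the starting point. Holes have negative weights. The weight of each triangle is its area, calculated in deg^2
GeometryCollection | The centroid of all the underlying geometries with the highest dimension. If there are polygons together with lines and/or points, the lines and/or points are ignored. If there are only lines and points, the points are ignored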
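
The [Multi]LineString rule can be sketched as follows. This is a minimal Ruby illustration, not the Elasticsearch implementation: it assumes that "length in degrees" means the planar Euclidean distance between consecutive vertices in lon/lat space, and that the centroid of a segment is its midpoint. `line_centroid` is a hypothetical helper name.

```ruby
# Centroid of a line string given as [[x, y], ...] with coordinates in degrees.
# Each segment contributes its midpoint, weighted by its length in degrees.
def line_centroid(coords)
  total = 0.0
  sum_x = 0.0
  sum_y = 0.0
  coords.each_cons(2) do |(x1, y1), (x2, y2)|
    length = Math.hypot(x2 - x1, y2 - y1) # segment length in degrees (assumed planar)
    total += length
    sum_x += length * (x1 + x2) / 2.0
    sum_y += length * (y1 + y2) / 2.0
  end
  return nil if total.zero? # degenerate line: no length to weight by

  [sum_x / total, sum_y / total]
end

# An L-shaped line: a segment of length 2 followed by a segment of length 1.
line = [[0.0, 0.0], [2.0, 0.0], [2.0, 1.0]]
cx, cy = line_centroid(line)
puts "centroid: #{cx}, #{cy}"
```

Because the longer segment carries twice the weight, the centroid is pulled toward it rather than sitting at the plain average of the three vertices.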

Example:

response = client.indices.create(
  index: 'places',
  body: {
    mappings: {
      properties: {
        geometry: {
          type: 'geo_shape'
        }
      }
    }
  }
)
puts response

response = client.bulk(
  index: 'places',
  refresh: true,
  body: [
    {
      index: {
        _id: 1
      }
    },
    {
      name: 'NEMO Science Museum',
      geometry: 'POINT(4.912350 52.374081)'
    },
    {
      index: {
        _id: 2
      }
    },
    {
      name: 'Sportpark De Weeren',
      geometry: {
        type: 'Polygon',
        coordinates: [
          [
            [
              4.965305328369141,
              52.39347642069457
            ],
            [
              4.966979026794433,
              52.391721758934835
            ],
            [
              4.969425201416015,
              52.39238958618537
            ],
            [
              4.967944622039794,
              52.39420969150824
            ],
            [
              4.965305328369141,
              52.39347642069457
            ]
          ]
        ]
      }
    }
  ]
)
puts response

response = client.search(
  index: 'places',
  size: 0,
  body: {
    aggregations: {
      centroid: {
        geo_centroid: {
          field: 'geometry'
        }
      }
    }
  }
)
puts response

PUT /places
{
  "mappings": {
    "properties": {
      "geometry": {
        "type": "geo_shape"
      }
    }
  }
}

POST /places/_bulk?refresh
{"index":{"_id":1}}
{"name": "NEMO Science Museum", "geometry": "POINT(4.912350 52.374081)" }
{"index":{"_id":2}}
{"name": "Sportpark De Weeren", "geometry": { "type": "Polygon", "coordinates": [ [ [ 4.965305328369141, 52.39347642069457 ], [ 4.966979026794433, 52.391721758934835 ], [ 4.969425201416015, 52.39238958618537 ], [ 4.967944622039794, 52.39420969150824 ], [ 4.965305328369141, 52.39347642069457 ] ] ] } }

POST /places/_search?size=0
{
  "aggs": {
    "centroid": {
      "geo_centroid": {
        "field": "geometry"
      }
    }
  }
}

The response for the above aggregation:

{
  ...
  "aggregations": {
    "centroid": {
      "location": {
        "lat": 52.39296147599816,
        "lon": 4.967404240742326
      },
      "count": 2
    }
  }
}
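
Although the response reports a count of 2, the returned location appears to match the centroid of the polygon alone, which is what the highest-dimension rule predicts. The Ruby sketch below applies the triangle-based polygon rule to the Sportpark De Weeren ring. It is an illustration under the assumption of planar geometry in lon/lat space, and `ring_centroid` is a hypothetical helper, not an Elasticsearch API.

```ruby
# Outer ring of the polygon from the bulk request above, as [lon, lat] pairs.
RING = [
  [4.965305328369141, 52.39347642069457],
  [4.966979026794433, 52.391721758934835],
  [4.969425201416015, 52.39238958618537],
  [4.967944622039794, 52.39420969150824],
  [4.965305328369141, 52.39347642069457]
].freeze

# Area-weighted average of the centroids of the triangles formed by the
# starting point and every two consecutive vertices.
def ring_centroid(ring)
  x0, y0 = ring.first
  area_sum = 0.0
  sum_x = 0.0
  sum_y = 0.0
  ring.drop(1).each_cons(2) do |(x1, y1), (x2, y2)|
    # Signed area of triangle (start, p1, p2) in deg^2.
    area = ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2.0
    area_sum += area
    sum_x += area * (x0 + x1 + x2) / 3.0
    sum_y += area * (y0 + y1 + y2) / 3.0
  end
  [sum_x / area_sum, sum_y / area_sum]
end

lon, lat = ring_centroid(RING)
puts "lat: #{lat}, lon: #{lon}"
```

The point document (NEMO Science Museum) is not used in this calculation. A hole, if present, could be handled by running the same loop over the inner ring and subtracting its contribution.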

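Using geo_centroid as a sub-aggregation of geohash_grid

The geohash_grid aggregation places documents, not individual geopoints, into buckets. If a document's geo_point field contains multiple values, the document could be assigned to multiple buckets, even if one or more of its geopoints lie outside the bucket boundaries.

If a geo_centroid sub-aggregation is also used, each centroid is calculated using all geopoints in a bucket, including those outside the bucket boundaries. This can result in centroids that lie outside the bucket boundaries.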
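
The effect can be illustrated with a small Ruby sketch. It does not compute real geohashes: `Cell` is a hypothetical bounding box standing in for one grid cell, and the document and its coordinates are made up for the example.

```ruby
# Hypothetical rectangular bucket standing in for a single geohash cell.
Cell = Struct.new(:min_lon, :max_lon, :min_lat, :max_lat) do
  def contains?(lon, lat)
    lon >= min_lon && lon < max_lon && lat >= min_lat && lat < max_lat
  end
end

cell = Cell.new(0.0, 10.0, 40.0, 50.0)

# One document with a multi-valued geo_point field, as [lon, lat] pairs.
docs = [
  { id: 1, points: [[2.0, 45.0], [30.0, 45.0]] } # second point is outside the cell
]

# A document joins the bucket if any of its points falls inside the cell.
in_bucket = docs.select do |doc|
  doc[:points].any? { |lon, lat| cell.contains?(lon, lat) }
end

# The centroid uses every point of every document in the bucket.
all_points = in_bucket.flat_map { |doc| doc[:points] }
centroid_lon = all_points.sum { |p| p[0] } / all_points.size
centroid_lat = all_points.sum { |p| p[1] } / all_points.size

puts "centroid: lon=#{centroid_lon}, lat=#{centroid_lat}"
puts "inside cell? #{cell.contains?(centroid_lon, centroid_lat)}"
```

Here the document is placed in the bucket because of its first point, but the second point drags the centroid to a longitude beyond the cell's eastern edge.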