Stats aggregation

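A multi-value metrics aggregation that computes statistics over numeric values extracted from the aggregated documents.

The statistics returned are: min, max, sum, count, and avg.

Assume the data consists of documents representing students' exam grades (between 0 and 100).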

Python:

resp = client.search(
    index="exams",
    size=0,
    aggs={
        "grades_stats": {
            "stats": {
                "field": "grade"
            }
        }
    },
)
print(resp)

Ruby:

response = client.search(
  index: 'exams',
  size: 0,
  body: {
    aggregations: {
      grades_stats: {
        stats: {
          field: 'grade'
        }
      }
    }
  }
)
puts response

JavaScript:

const response = await client.search({
  index: "exams",
  size: 0,
  aggs: {
    grades_stats: {
      stats: {
        field: "grade",
      },
    },
  },
});
console.log(response);

Console:

POST /exams/_search?size=0
{
  "aggs": {
    "grades_stats": { "stats": { "field": "grade" } }
  }
}

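The above aggregation computes the grade statistics over all documents. The aggregation type is stats, and the field setting defines the numeric field of the documents that the statistics are computed on. The above returns the following: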

{
  ...

  "aggregations": {
    "grades_stats": {
      "count": 2,
      "min": 50.0,
      "max": 100.0,
      "avg": 75.0,
      "sum": 150.0
    }
  }
}

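The name of the aggregation (grades_stats above) also serves as the key by which the aggregation result can be retrieved from the returned response.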
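
As a minimal sketch of reading the result back by name, the snippet below uses a plain Python dictionary that mirrors the response body shown above; it does not contact a cluster. The helper name get_stats is hypothetical and exists only for illustration.

```python
# A dictionary mirroring the response body shown above (no cluster needed).
response = {
    "aggregations": {
        "grades_stats": {
            "count": 2,
            "min": 50.0,
            "max": 100.0,
            "avg": 75.0,
            "sum": 150.0,
        }
    }
}


def get_stats(resp, agg_name):
    """Hypothetical helper: look up one aggregation's result by its name."""
    return resp["aggregations"][agg_name]


stats = get_stats(response, "grades_stats")

# avg is sum divided by count, which holds for the values above.
assert stats["avg"] == stats["sum"] / stats["count"]
print(stats["min"], stats["max"], stats["avg"])
```

Using a different aggregation name in the request would change only the key used in the lookup, not the shape of the result.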

Script

If you need statistics for something more complex than a single field, run the aggregation on a runtime field.

Python:

resp = client.search(
    index="exams",
    size=0,
    runtime_mappings={
        "grade.weighted": {
            "type": "double",
            "script": "emit(doc['grade'].value * doc['weight'].value)"
        }
    },
    aggs={
        "grades_stats": {
            "stats": {
                "field": "grade.weighted"
            }
        }
    },
)
print(resp)

Ruby:

response = client.search(
  index: 'exams',
  body: {
    size: 0,
    runtime_mappings: {
      'grade.weighted' => {
        type: 'double',
        script: "emit(doc['grade'].value * doc['weight'].value)"
      }
    },
    aggregations: {
      grades_stats: {
        stats: {
          field: 'grade.weighted'
        }
      }
    }
  }
)
puts response

JavaScript:

const response = await client.search({
  index: "exams",
  size: 0,
  runtime_mappings: {
    "grade.weighted": {
      type: "double",
      script: "emit(doc['grade'].value * doc['weight'].value)",
    },
  },
  aggs: {
    grades_stats: {
      stats: {
        field: "grade.weighted",
      },
    },
  },
});
console.log(response);

Console:

POST /exams/_search
{
  "size": 0,
  "runtime_mappings": {
    "grade.weighted": {
      "type": "double",
      "script": """
        emit(doc['grade'].value * doc['weight'].value)
      """
    }
  },
  "aggs": {
    "grades_stats": {
      "stats": {
        "field": "grade.weighted"
      }
    }
  }
}
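
To make the effect of the runtime field concrete, the following sketch reproduces the same arithmetic locally in plain Python: it multiplies grade by weight for each document and then computes the five statistics over the products. The sample documents and their weights are made up for illustration, and the summarize helper is hypothetical; how it handles an empty list is a choice made for this sketch, not a statement about the server's behavior.

```python
def summarize(values):
    """Compute count, min, max, avg and sum for a list of numbers."""
    count = len(values)
    if count == 0:
        # Choice made for this sketch only: nothing to summarize.
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0.0}
    total = float(sum(values))
    return {
        "count": count,
        "min": float(min(values)),
        "max": float(max(values)),
        "avg": total / count,
        "sum": total,
    }


# Hypothetical sample documents; the weights are invented for illustration.
docs = [
    {"grade": 50, "weight": 2},
    {"grade": 100, "weight": 1},
]

# Same expression as the runtime field script: grade * weight per document.
weighted = [doc["grade"] * doc["weight"] for doc in docs]
weighted_stats = summarize(weighted)
print(weighted_stats)
```

The aggregation itself never sees grade or weight separately; it only receives the emitted product for each document, which is what the list comprehension models here.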

Missing value

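The missing parameter defines how documents that are missing a value should be treated. By default they are ignored, but they can also be treated as if they had a value.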

Python:

resp = client.search(
    index="exams",
    size=0,
    aggs={
        "grades_stats": {
            "stats": {
                "field": "grade",
                "missing": 0
            }
        }
    },
)
print(resp)

Ruby:

response = client.search(
  index: 'exams',
  size: 0,
  body: {
    aggregations: {
      grades_stats: {
        stats: {
          field: 'grade',
          missing: 0
        }
      }
    }
  }
)
puts response

JavaScript:

const response = await client.search({
  index: "exams",
  size: 0,
  aggs: {
    grades_stats: {
      stats: {
        field: "grade",
        missing: 0,
      },
    },
  },
});
console.log(response);

Console:

POST /exams/_search?size=0
{
  "aggs": {
    "grades_stats": {
      "stats": {
        "field": "grade",
        "missing": 0
      }
    }
  }
}

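With "missing": 0, documents without a value in the grade field are treated as if they had the value 0, so they are included in the computed statistics.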
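
The difference between the default behavior and a missing value of 0 can be illustrated locally. The sketch below is plain Python with made-up documents; grade_values and summarize are hypothetical helpers that only mimic the described handling and are not part of any client library.

```python
def grade_values(docs, missing=None):
    """Collect grade values; substitute `missing` for absent ones if given."""
    values = []
    for doc in docs:
        if doc.get("grade") is not None:
            values.append(doc["grade"])
        elif missing is not None:
            # Treat the document as if it had this value.
            values.append(missing)
        # Otherwise the document is ignored (the default described above).
    return values


def summarize(values):
    """Compute count, min, max, avg and sum for a non-empty list of numbers."""
    total = float(sum(values))
    return {
        "count": len(values),
        "min": float(min(values)),
        "max": float(max(values)),
        "avg": total / len(values),
        "sum": total,
    }


# Hypothetical documents: the third one has no grade field.
docs = [{"grade": 50}, {"grade": 100}, {"student": "no grade recorded"}]

ignored = summarize(grade_values(docs))
as_zero = summarize(grade_values(docs, missing=0))
print(ignored)
print(as_zero)
```

In this sample, substituting 0 raises count and lowers min and avg, while sum stays the same because the substituted value adds nothing to the total.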