AWS CloudWatch metricset
The cloudwatch metricset of the aws module lets you monitor a range of services on AWS. It periodically fetches the metrics of the given namespaces by calling the GetMetricData API.
AWS Permissions
The IAM user needs the following AWS permissions to collect AWS CloudWatch metrics.
ec2:DescribeRegions
cloudwatch:GetMetricData
cloudwatch:ListMetrics
tag:getResources
sts:GetCallerIdentity
iam:ListAccountAliases
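As an illustration of how the permissions above could be granted, the following minimal sketch builds an IAM policy document that allows exactly those actions. The helper name `build_policy` and the use of `"Resource": "*"` are assumptions made for brevity, not something this page prescribes; narrow the resource scope where your setup allows it.

```python
import json

# Actions listed above as required for collecting CloudWatch metrics.
REQUIRED_ACTIONS = [
    "ec2:DescribeRegions",
    "cloudwatch:GetMetricData",
    "cloudwatch:ListMetrics",
    "tag:getResources",
    "sts:GetCallerIdentity",
    "iam:ListAccountAliases",
]


def build_policy(actions):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: "*" keeps the sketch short.
                "Resource": "*",
            }
        ],
    }


policy = build_policy(REQUIRED_ACTIONS)
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the IAM user or role whose credentials the metricset uses.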
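Metricset-specific configuration notes
- namespace: The namespace that the ListMetrics API filters against. For example, AWS/EC2 or AWS/S3. If the wildcard * is given as the namespace, metrics from all namespaces are collected automatically.
- name: The names of the metrics to filter against. For example, CPUUtilization for EC2 instances.
- dimensions: The dimensions to filter against. For example, InstanceId=i-123.
- resource_type: The constraints on the resources that you want returned. The format of each resource type is service[:resourceType]. For example, a resource type of ec2 returns all Amazon EC2 resources (including EC2 instances), while a resource type of ec2:instance returns only EC2 instances.
- statistic: A statistic is an aggregation of metric data over a specified period of time. By default, the statistics are Average, Sum, Count, Maximum and Minimum.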
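To make the service[:resourceType] format of resource_type concrete, here is a small sketch that splits such a value into its two parts. The function name `parse_resource_type` is hypothetical and only illustrates the format; it is not part of Metricbeat.

```python
def parse_resource_type(resource_type):
    """Split 'service[:resourceType]' into (service, resource_type or None)."""
    service, _, kind = resource_type.partition(":")
    if not service:
        raise ValueError("resource_type must start with a service name")
    # An absent (or empty) part after ':' means "all resources of the service".
    return service, (kind or None)


# 'ec2' covers every EC2 resource, 'ec2:instance' only EC2 instances.
print(parse_resource_type("ec2"))
print(parse_resource_type("ec2:instance"))
```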
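Configuration examples
To stay focused on cloudwatch metricset use cases, the examples below do not include the configuration of AWS credentials. See AWS credentials options for details on setting AWS credentials in the configuration so that this metricset can make proper AWS API calls.
Example 1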
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  tags_filter:
    - key: "Organization"
      value: "Engineering"
  metrics:
    - namespace: AWS/EBS
    - namespace: AWS/ELB
      resource_type: elasticloadbalancing
    - namespace: AWS/EC2
      name: CPUUtilization
      statistic: ["Average"]
      dimensions:
        - name: InstanceId
          value: i-0686946e22cf9494a
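Users can configure the cloudwatch metricset to collect all metrics from one specific namespace, such as AWS/EBS, by giving only the namespace.
If tags are collected (for metrics with resource_type specified), tags_filter restricts collection to resources whose tags match, here the key Organization with the value Engineering.
If the user knows exactly which CloudWatch metrics to collect, this configuration format can be used. namespace and name need to be specified, and dimensions can be used to filter the metrics further.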
Example 2
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: "*"
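With this configuration, metrics from all namespaces are collected from CloudWatch. The limitation is that every namespace gets the same collection period, in this case 300 seconds. Depending on how often a namespace publishes, this causes either data loss or extra API-call cost. For example, metrics from the AWS/Usage namespace are sent to CloudWatch every minute, so with a collection period of 300 seconds the data points in between are lost. Metrics from the AWS/Billing namespace are sent to CloudWatch only every few hours, so querying that namespace every 300 seconds incurs additional cost without returning new data.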
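The trade-off described above can be worked through with a little arithmetic. The sketch below is only a model: the function name `collection_tradeoff` is hypothetical, it assumes one data point is kept per collection, and the 4-hour publish interval used for AWS/Billing is an assumed stand-in for "every few hours".

```python
def collection_tradeoff(collection_period_s, publish_interval_s):
    """Compare the collection period with a namespace's publish interval.

    Returns (skipped, redundant):
      skipped   - data points published between two collections that are lost
      redundant - collections made before any new data point is available
    """
    if collection_period_s >= publish_interval_s:
        published = collection_period_s // publish_interval_s
        return published - 1, 0
    collections = publish_interval_s // collection_period_s
    return 0, collections - 1


# AWS/Usage publishes every 60 s; collecting every 300 s skips data points.
print("AWS/Usage:", collection_tradeoff(300, 60))

# AWS/Billing: assumed 4 h (14400 s) interval; most collections are redundant.
print("AWS/Billing:", collection_tradeoff(300, 4 * 3600))
```

Under these assumptions, four of every five AWS/Usage data points are skipped, while 47 of every 48 AWS/Billing queries return nothing new. Splitting the namespaces across module entries with different period values avoids both effects.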
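Example 3
Depending on the configuration and the number of services in the AWS account, the number of API calls can become very large and lead to high API costs. To reduce the number of API calls, we recommend the following configuration options, used as in the example below.
- metrics.name: Collect only the sub-list of metrics that is useful for your use case.
- metrics.statistic: By default, the cloudwatch metricset makes API calls to get all statistics, such as Average, Maximum, Minimum and Sum. If you know which statistic is most helpful, specify it in the configuration.
- metrics.dimensions: Different AWS services report different dimensions in their CloudWatch metrics. For example, EMR metrics can have either a JobFlowId dimension or a JobId dimension. If you know which specific dimension is useful, specify it in this configuration option.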
- module: aws
  period: 5m
  metricsets:
    - cloudwatch
  regions: us-east-1
  metrics:
    - namespace: AWS/ElasticMapReduce
      name: ["S3BytesWritten", "S3BytesRead", "HDFSUtilization", "TotalLoad"]
      resource_type: elasticmapreduce
      statistic: ["Average"]
      dimensions:
        - name: JobId
          value: "*"
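To see why naming metrics and statistics reduces the request volume, the rough estimate below counts metric queries for the EMR configuration above. It is a simplified model rather than Metricbeat's actual accounting: it assumes one query per (metric, statistic) pair for each matching resource, and `estimate_queries` and its `resources` parameter are hypothetical names.

```python
# Default statistics, as listed in the configuration notes.
DEFAULT_STATISTICS = ["Average", "Sum", "Count", "Maximum", "Minimum"]


def estimate_queries(metric_names, statistics=None, resources=1):
    """Estimate metric queries per collection: names x statistics x resources."""
    stats = statistics or DEFAULT_STATISTICS
    return len(metric_names) * len(stats) * resources


emr_metrics = ["S3BytesWritten", "S3BytesRead", "HDFSUtilization", "TotalLoad"]

# With statistic: ["Average"] as in the example configuration.
print(estimate_queries(emr_metrics, ["Average"]))

# Without a statistic setting, all five default statistics are requested.
print(estimate_queries(emr_metrics))
```

Under this model, specifying a single statistic cuts the query count for one resource from 20 to 4.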
More examples
With the configuration below, users can collect CloudWatch metrics from EBS, ELB and EC2 without tag information.
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: AWS/EBS
    - namespace: AWS/ELB
    - namespace: AWS/EC2
With the configuration below, users can collect CloudWatch metrics from EBS, ELB and EC2 together with the tags of these services.
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: AWS/EBS
      resource_type: ebs
    - namespace: AWS/ELB
      resource_type: elasticloadbalancing
    - namespace: AWS/EC2
      resource_type: ec2:instance
With the configuration below, users can collect specific CloudWatch metrics: for example, the CPUUtilization metric (average) of EC2 instance i-123 and the NetworkIn metric (average) of EC2 instance i-456.
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: AWS/EC2
      name: ["CPUUtilization"]
      resource_type: ec2:instance
      dimensions:
        - name: InstanceId
          value: i-123
      statistic: ["Average"]
    - namespace: AWS/EC2
      name: ["NetworkIn"]
      dimensions:
        - name: InstanceId
          value: i-456
      statistic: ["Average"]
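With the configuration below, users can select only the metrics that have the LoadBalancer and TargetGroup dimensions and the metric name UnHealthyHostCount, whatever the values of LoadBalancer and TargetGroup are.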
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: AWS/ApplicationELB
      statistic: ['Maximum']
      name: ['UnHealthyHostCount']
      dimensions:
        - name: LoadBalancer
          value: "*"
        - name: TargetGroup
          value: "*"
      resource_type: elasticloadbalancing
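The wildcard dimension filter above can be pictured with the following sketch. It assumes that a metric matches only when its dimension names are exactly the configured ones and that "*" accepts any value; this is one reading of the description, not a statement of Metricbeat's internal logic. The function `matches` and the sample dimension values are hypothetical.

```python
def matches(metric_dimensions, wanted):
    """Return True if a metric's dimensions satisfy the configured filter.

    metric_dimensions: dict of dimension name -> value reported for the metric
    wanted:            dict of dimension name -> required value, or "*" for any
    """
    # Assumption: the set of dimension names must match exactly.
    if set(metric_dimensions) != set(wanted):
        return False
    return all(v == "*" or metric_dimensions[k] == v for k, v in wanted.items())


wanted = {"LoadBalancer": "*", "TargetGroup": "*"}

# Hypothetical metrics as they might be listed for AWS/ApplicationELB.
per_target_group = {"LoadBalancer": "app/my-lb/abc", "TargetGroup": "targetgroup/tg/123"}
per_zone = dict(per_target_group, AvailabilityZone="us-east-1a")

print(matches(per_target_group, wanted))
print(matches(per_zone, wanted))
```

Under this reading, the metric that carries an additional AvailabilityZone dimension is not selected.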
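This is a default metricset. If the host module is unconfigured, this metricset is enabled by default.
For a description of each field in the metricset, see the exported fields section.
Here is an example document generated by this metricset: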
{ "@timestamp": "2017-10-12T08:05:34.853Z", "aws": { "cloudwatch": { "namespace": "AWS/RDS" }, "dimensions": { "DBClusterIdentifier": "database-1", "Role": "READER" }, "rds": { "metrics": { "AbortedClients": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "ActiveTransactions": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "AuroraBinlogReplicaLag": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "AuroraReplicaLag": { "avg": 18.4158, "count": 5, "max": 23.787, "min": 10.634, "sum": 92.07900000000001 }, "AuroraVolumeBytesLeftTotal": { "avg": 70007366615040, "count": 5, "max": 70007366615040, "min": 70007366615040, "sum": 350036833075200 }, "Aurora_pq_request_attempted": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_executed": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_failed": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_in_progress": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_below_min_rows": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_few_pages_outside_buffer_pool": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_long_trx": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_pq_high_buffer_pool_pct": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_small_table": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_unsupported_access": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_throttled": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "BlockedTransactions": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "BufferCacheHitRatio": { "avg": 100, "count": 5, "max": 100, "min": 100, "sum": 500 }, 
"CPUUtilization": { "avg": 6.051666111792592, "count": 5, "max": 6.216563057282379, "min": 5.808333333333334, "sum": 30.25833055896296 }, "CommitLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "CommitThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "ConnectionAttempts": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DDLLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DDLThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DMLLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DMLThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DatabaseConnections": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Deadlocks": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DeleteLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DeleteThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "EBSByteBalance%": { "avg": 99, "count": 1, "max": 99, "min": 99, "sum": 99 }, "EBSIOBalance%": { "avg": 99, "count": 1, "max": 99, "min": 99, "sum": 99 }, "EngineUptime": { "avg": 20800826, "count": 5, "max": 20800946, "min": 20800706, "sum": 104004130 }, "FreeLocalStorage": { "avg": 29682751078.4, "count": 5, "max": 29682819072, "min": 29682675712, "sum": 148413755392 }, "FreeableMemory": { "avg": 4639068160, "count": 5, "max": 4639838208, "min": 4638638080, "sum": 23195340800 }, "InsertLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "InsertThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "LoginFailures": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "NetworkReceiveThroughput": { "avg": 0.8399323667305664, "count": 5, "max": 1.399556807011113, "min": 0.6999533364442371, "sum": 4.199661833652832 }, "NetworkThroughput": { "avg": 1.6798647334611327, "count": 5, "max": 2.799113614022226, "min": 1.3999066728884741, "sum": 8.399323667305664 }, "NetworkTransmitThroughput": { "avg": 
0.8399323667305664, "count": 5, "max": 1.399556807011113, "min": 0.6999533364442371, "sum": 4.199661833652832 }, "NumBinaryLogFiles": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Queries": { "avg": 6.3836833181909265, "count": 5, "max": 6.53289780681288, "min": 6.184260972479205, "sum": 31.91841659095463 }, "ReadLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "ResultSetCacheHitRatio": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "RollbackSegmentHistoryListLength": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "RowLockTime": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "SelectLatency": { "avg": 0.2519199153394592, "count": 5, "max": 0.2609050632911392, "min": 0.24367924528301885, "sum": 1.2595995766972958 }, "SelectThroughput": { "avg": 2.6002296989354514, "count": 5, "max": 2.650618477644784, "min": 2.5335866920025336, "sum": 13.001148494677256 }, "SumBinaryLogSize": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "UpdateLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "UpdateThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "WriteLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 } } } }, "cloud": { "account": { "id": "428152502467", "name": "elastic-beats" }, "provider": "aws", "region": "eu-west-1" }, "event": { "dataset": "aws.cloudwatch", "duration": 115000, "module": "aws" }, "metricset": { "name": "cloudwatch", "period": 10000 }, "service": { "type": "aws" } }
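Each metric in the sample document carries the five default statistics as avg, count, max, min and sum. As a quick sanity check when reading such documents, the sketch below verifies for two entries copied from the sample that avg equals sum divided by count and lies between min and max. The helper `is_consistent` is a hypothetical name used for illustration.

```python
import math

# Two statistic blocks copied from the sample document above.
samples = {
    "AuroraReplicaLag": {
        "avg": 18.4158, "count": 5, "max": 23.787,
        "min": 10.634, "sum": 92.07900000000001,
    },
    "CPUUtilization": {
        "avg": 6.051666111792592, "count": 5, "max": 6.216563057282379,
        "min": 5.808333333333334, "sum": 30.25833055896296,
    },
}


def is_consistent(stats):
    """Check avg == sum / count (within float tolerance) and min <= avg <= max."""
    mean_ok = math.isclose(stats["avg"], stats["sum"] / stats["count"], rel_tol=1e-9)
    range_ok = stats["min"] <= stats["avg"] <= stats["max"]
    return mean_ok and range_ok


for name, stats in samples.items():
    print(name, is_consistent(stats))
```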