AWS CloudWatch metricset
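The cloudwatch metricset of the aws module lets you monitor various services on AWS. The cloudwatch metricset periodically fetches metrics from a given namespace by calling the GetMetricData API.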
AWS Permissions
The IAM user needs the following AWS permissions to collect AWS CloudWatch metrics.
ec2:DescribeRegions
cloudwatch:GetMetricData
cloudwatch:ListMetrics
tag:getResources
sts:GetCallerIdentity
iam:ListAccountAliases
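As an illustration, the permissions above could be granted through an IAM policy document. The sketch below builds one in Python and prints it as JSON. The helper name `build_policy` is hypothetical, and the unrestricted `"Resource": "*"` is an assumption made for brevity, not a recommendation from this page.

```python
import json

# Permissions listed above for the cloudwatch metricset.
ACTIONS = [
    "ec2:DescribeRegions",
    "cloudwatch:GetMetricData",
    "cloudwatch:ListMetrics",
    "tag:getResources",
    "sts:GetCallerIdentity",
    "iam:ListAccountAliases",
]


def build_policy(actions):
    """Return an IAM policy document (as a dict) allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: no resource scoping; tighten this where possible.
                "Resource": "*",
            }
        ],
    }


policy = build_policy(ACTIONS)
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the IAM user or role whose credentials the metricset uses.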
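Metricset-specific configuration notes
- namespace: The namespace that the ListMetrics API filters against. For example, AWS/EC2 or AWS/S3. If the wildcard * is given as the namespace, metrics from all namespaces are collected automatically.
- name: The name of the metric to filter against. For example, CPUUtilization for an EC2 instance.
- dimensions: The dimensions to filter against. For example, InstanceId=i-123.
- resource_type: The constraints on the resources that you want returned. The format of each resource type is service[:resourceType]. For example, specifying the resource type ec2 returns all Amazon EC2 resources (including EC2 instances). Specifying the resource type ec2:instance returns only EC2 instances.
- statistic: Statistics are aggregations of metric data over a specified period of time. By default, statistic includes Average, Sum, Count, Maximum and Minimum.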
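To make the service[:resourceType] format concrete, here is a minimal sketch that splits a resource type string into its two parts. The function name `parse_resource_type` is hypothetical and only illustrates the format; it is not part of the metricset.

```python
def parse_resource_type(resource_type: str):
    """Split 'service[:resourceType]' into (service, resource type or None)."""
    service, _, rtype = resource_type.partition(":")
    return service, (rtype or None)


print(parse_resource_type("ec2"))           # ('ec2', None)
print(parse_resource_type("ec2:instance"))  # ('ec2', 'instance')
```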
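Configuration examples
To keep the focus on cloudwatch metricset use cases, the examples below do not include AWS credentials configuration. See AWS credentials options for details on setting AWS credentials in the configuration so that this metricset can make proper AWS API calls.
Example 1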
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  tags_filter:
    - key: "Organization"
      value: "Engineering"
  metrics:
    - namespace: AWS/EBS
    - namespace: AWS/ELB
      resource_type: elasticloadbalancing
    - namespace: AWS/EC2
      name: CPUUtilization
      statistic: ["Average"]
      dimensions:
        - name: InstanceId
          value: i-0686946e22cf9494a
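Users can configure [...]
If tags are collected (for those that specify [...]
If users know exactly which CloudWatch metrics they want to collect, this configuration format can be used. It requires specifying [...]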
Example 2
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: "*"
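With this configuration, metrics from all namespaces are collected from CloudWatch. The limitation is that the collection period is the same for every namespace, in this case 300 seconds. Depending on the namespace, this leads to either data loss or extra API cost. For example, metrics from the AWS/Usage namespace are sent to CloudWatch every minute. With a collection period of 300 seconds, the data points in between are lost. Metrics from the AWS/Billing namespace are sent to CloudWatch only every few hours. Querying the AWS/Billing namespace every 300 seconds therefore incurs additional cost.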
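The mismatch between publish interval and collection period can be put in numbers with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the 4-hour interval for AWS/Billing is an assumed value standing in for "every few hours".

```python
def points_per_collection(publish_interval_s: int, period_s: int) -> float:
    """Data points a namespace publishes during one collection period."""
    return period_s / publish_interval_s


def collections_per_point(publish_interval_s: int, period_s: int) -> float:
    """Collections that run for each data point the namespace publishes."""
    return publish_interval_s / period_s


PERIOD_S = 300  # collection period from Example 2

# AWS/Usage publishes every minute, so 5 points fall into each collection.
usage_points = points_per_collection(60, PERIOD_S)

# AWS/Billing publishes "every few hours"; 4 hours is an assumption.
ASSUMED_BILLING_INTERVAL_S = 4 * 3600
billing_collections = collections_per_point(ASSUMED_BILLING_INTERVAL_S, PERIOD_S)

print(usage_points)         # 5.0
print(billing_collections)  # 48.0
```

Under these assumptions, a single 300-second period is too coarse for AWS/Usage and far too frequent for AWS/Billing, which is why configuring namespaces separately with their own periods can be preferable to the wildcard.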
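Example 3
Depending on the configuration and the number of services in the AWS account, the number of API calls can grow large enough to incur high API costs. To reduce the number of API calls, consider the configuration below as an example.
- metrics.name: Collect only the subset of metrics that is useful for your use case.
- metrics.statistic: By default, the cloudwatch metricset makes API calls to get all statistics, such as average, max, min and sum. If you know which statistic is most useful, specify it in the configuration.
- metrics.dimensions: Different AWS services report different dimensions in their CloudWatch metrics. For example, EMR metrics can have either a JobFlowId dimension or a JobId dimension. If you know which specific dimension is useful, specify it in this configuration option.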
- module: aws
  period: 5m
  metricsets:
    - cloudwatch
  regions: us-east-1
  metrics:
    - namespace: AWS/ElasticMapReduce
      name: ["S3BytesWritten", "S3BytesRead", "HDFSUtilization", "TotalLoad"]
      resource_type: elasticmapreduce
      statistic: ["Average"]
      dimensions:
        - name: JobId
          value: "*"
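A rough way to see the effect of narrowing metrics.statistic is to count the (metric name, statistic) pairs requested for one matching resource. The sketch below assumes that each pair corresponds to one query, which is a simplification and not a description of how the metricset actually batches its API calls. The function name `queries_per_resource` is hypothetical.

```python
# Default statistics as listed in the configuration notes.
DEFAULT_STATISTICS = ["Average", "Sum", "Count", "Maximum", "Minimum"]


def queries_per_resource(metric_names, statistics=None):
    """Count (metric name, statistic) pairs requested for one matching resource."""
    stats = statistics or DEFAULT_STATISTICS
    return len(metric_names) * len(stats)


names = ["S3BytesWritten", "S3BytesRead", "HDFSUtilization", "TotalLoad"]

with_defaults = queries_per_resource(names)          # 4 names x 5 statistics
narrowed = queries_per_resource(names, ["Average"])  # 4 names x 1 statistic

print(with_defaults, narrowed)  # 20 4
```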
More examples
With the configuration below, users can collect CloudWatch metrics from EBS, ELB and EC2 without tag information.
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: AWS/EBS
    - namespace: AWS/ELB
    - namespace: AWS/EC2
With the configuration below, users can collect CloudWatch metrics from EBS, ELB and EC2 together with the tags from these services.
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: AWS/EBS
      resource_type: ebs
    - namespace: AWS/ELB
      resource_type: elasticloadbalancing
    - namespace: AWS/EC2
      resource_type: ec2:instance
With the configuration below, users can collect specific CloudWatch metrics. For example, the CPUUtilization metric (average) from EC2 instance i-123 and the NetworkIn metric (average) from EC2 instance i-456.
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: AWS/EC2
      name: ["CPUUtilization"]
      resource_type: ec2:instance
      dimensions:
        - name: InstanceId
          value: i-123
      statistic: ["Average"]
    - namespace: AWS/EC2
      name: ["NetworkIn"]
      dimensions:
        - name: InstanceId
          value: i-456
      statistic: ["Average"]
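To show how entries like these relate to the GetMetricData API, the sketch below turns config-like metric entries into a request in the shape that API expects (MetricDataQueries, StartTime, EndTime). It illustrates the mapping only and is not the metricset's actual implementation. The function name `build_get_metric_data_request` is hypothetical, and using one collection period as the query window is an assumption.

```python
from datetime import datetime, timedelta, timezone


def build_get_metric_data_request(metrics, period_s, end_time=None):
    """Build a GetMetricData-style request dict from config-like metric entries."""
    end_time = end_time or datetime.now(timezone.utc)
    queries = []
    for entry in metrics:
        for name in entry["name"]:
            for stat in entry["statistic"]:
                queries.append({
                    # Query ids must start with a lowercase letter.
                    "Id": f"q{len(queries)}",
                    "MetricStat": {
                        "Metric": {
                            "Namespace": entry["namespace"],
                            "MetricName": name,
                            "Dimensions": [
                                {"Name": d["name"], "Value": d["value"]}
                                for d in entry.get("dimensions", [])
                            ],
                        },
                        "Period": period_s,
                        "Stat": stat,
                    },
                })
    return {
        "MetricDataQueries": queries,
        # Assumption: the query window spans exactly one collection period.
        "StartTime": end_time - timedelta(seconds=period_s),
        "EndTime": end_time,
    }


# Config-like entries mirroring the example above.
metrics = [
    {"namespace": "AWS/EC2", "name": ["CPUUtilization"], "statistic": ["Average"],
     "dimensions": [{"name": "InstanceId", "value": "i-123"}]},
    {"namespace": "AWS/EC2", "name": ["NetworkIn"], "statistic": ["Average"],
     "dimensions": [{"name": "InstanceId", "value": "i-456"}]},
]

request = build_get_metric_data_request(metrics, period_s=300)
for query in request["MetricDataQueries"]:
    print(query["Id"], query["MetricStat"]["Metric"]["MetricName"])
```

Each (name, statistic) pair becomes one entry in MetricDataQueries, so the two configured metrics produce two queries.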
With the configuration below, users can select only the metrics named UnHealthyHostCount that carry the LoadBalancer and TargetGroup dimensions. The LoadBalancer and TargetGroup values can be anything.
- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: AWS/ApplicationELB
      statistic: ['Maximum']
      name: ['UnHealthyHostCount']
      dimensions:
        - name: LoadBalancer
          value: "*"
        - name: TargetGroup
          value: "*"
      resource_type: elasticloadbalancing
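This is a default metricset. If the host module is unconfigured, this metricset is enabled by default.
For a description of each field in the metricset, see the exported fields section.
Here is an example document generated by this metricset: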
{ "@timestamp": "2017-10-12T08:05:34.853Z", "aws": { "cloudwatch": { "namespace": "AWS/RDS" }, "dimensions": { "DBClusterIdentifier": "database-1", "Role": "READER" }, "rds": { "metrics": { "AbortedClients": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "ActiveTransactions": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "AuroraBinlogReplicaLag": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "AuroraReplicaLag": { "avg": 18.4158, "count": 5, "max": 23.787, "min": 10.634, "sum": 92.07900000000001 }, "AuroraVolumeBytesLeftTotal": { "avg": 70007366615040, "count": 5, "max": 70007366615040, "min": 70007366615040, "sum": 350036833075200 }, "Aurora_pq_request_attempted": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_executed": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_failed": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_in_progress": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_below_min_rows": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_few_pages_outside_buffer_pool": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_long_trx": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_pq_high_buffer_pool_pct": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_small_table": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_not_chosen_unsupported_access": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Aurora_pq_request_throttled": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "BlockedTransactions": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "BufferCacheHitRatio": { "avg": 100, "count": 5, "max": 100, "min": 100, "sum": 500 }, 
"CPUUtilization": { "avg": 6.051666111792592, "count": 5, "max": 6.216563057282379, "min": 5.808333333333334, "sum": 30.25833055896296 }, "CommitLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "CommitThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "ConnectionAttempts": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DDLLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DDLThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DMLLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DMLThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DatabaseConnections": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Deadlocks": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DeleteLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "DeleteThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "EBSByteBalance%": { "avg": 99, "count": 1, "max": 99, "min": 99, "sum": 99 }, "EBSIOBalance%": { "avg": 99, "count": 1, "max": 99, "min": 99, "sum": 99 }, "EngineUptime": { "avg": 20800826, "count": 5, "max": 20800946, "min": 20800706, "sum": 104004130 }, "FreeLocalStorage": { "avg": 29682751078.4, "count": 5, "max": 29682819072, "min": 29682675712, "sum": 148413755392 }, "FreeableMemory": { "avg": 4639068160, "count": 5, "max": 4639838208, "min": 4638638080, "sum": 23195340800 }, "InsertLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "InsertThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "LoginFailures": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "NetworkReceiveThroughput": { "avg": 0.8399323667305664, "count": 5, "max": 1.399556807011113, "min": 0.6999533364442371, "sum": 4.199661833652832 }, "NetworkThroughput": { "avg": 1.6798647334611327, "count": 5, "max": 2.799113614022226, "min": 1.3999066728884741, "sum": 8.399323667305664 }, "NetworkTransmitThroughput": { "avg": 
0.8399323667305664, "count": 5, "max": 1.399556807011113, "min": 0.6999533364442371, "sum": 4.199661833652832 }, "NumBinaryLogFiles": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "Queries": { "avg": 6.3836833181909265, "count": 5, "max": 6.53289780681288, "min": 6.184260972479205, "sum": 31.91841659095463 }, "ReadLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "ResultSetCacheHitRatio": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "RollbackSegmentHistoryListLength": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "RowLockTime": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "SelectLatency": { "avg": 0.2519199153394592, "count": 5, "max": 0.2609050632911392, "min": 0.24367924528301885, "sum": 1.2595995766972958 }, "SelectThroughput": { "avg": 2.6002296989354514, "count": 5, "max": 2.650618477644784, "min": 2.5335866920025336, "sum": 13.001148494677256 }, "SumBinaryLogSize": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "UpdateLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "UpdateThroughput": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 }, "WriteLatency": { "avg": 0, "count": 5, "max": 0, "min": 0, "sum": 0 } } } }, "cloud": { "account": { "id": "428152502467", "name": "elastic-beats" }, "provider": "aws", "region": "eu-west-1" }, "event": { "dataset": "aws.cloudwatch", "duration": 115000, "module": "aws" }, "metricset": { "name": "cloudwatch", "period": 10000 }, "service": { "type": "aws" } }