AWS Fargate task_stats metricset

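This functionality is in beta and is subject to change. The design and code are less mature than official GA features and are provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.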

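The task_stats metricset in the awsfargate module lets you monitor containers inside the same AWS Fargate task. It fetches runtime CPU metrics, disk I/O metrics, memory metrics, network metrics, and container metadata from two endpoints: ${ECS_CONTAINER_METADATA_URI_V4}/task/stats and ${ECS_CONTAINER_METADATA_URI_V4}/task.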
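As a rough illustration of how the two endpoints are addressed, the following Python sketch builds both URLs from the ECS_CONTAINER_METADATA_URI_V4 environment variable and fetches them with the standard library. The function names task_endpoints and fetch_json are hypothetical, and this is not a description of how Metricbeat itself is implemented. The fetch only succeeds inside a running task, where ECS sets the variable.

```python
import json
import os
import urllib.request


def task_endpoints(base_uri):
    """Return the two URLs the metricset reads, given the metadata base URI."""
    base = base_uri.rstrip("/")
    return {
        "stats": base + "/task/stats",  # runtime CPU, disk I/O, memory, network
        "metadata": base + "/task",     # task and container metadata
    }


def fetch_json(url, timeout=5.0):
    """GET a URL and decode its JSON body (only reachable inside a task)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    base_uri = os.environ.get("ECS_CONTAINER_METADATA_URI_V4")
    if base_uri is None:
        print("ECS_CONTAINER_METADATA_URI_V4 is not set; run this inside a Fargate task")
    else:
        for name, url in task_endpoints(base_uri).items():
            print(name, "->", len(fetch_json(url)), "top-level entries")
```

Keeping the URL construction separate from the HTTP call makes it possible to check the endpoint paths without access to a Fargate task.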

Configuration example

This metricset should run as a sidecar inside the same AWS Fargate task definition. The default configuration file should work as-is.

- module: awsfargate
  period: 10s
  metricsets:
    - task_stats

Set up Metricbeat with AWS Fargate

This section describes an AWS-native way to configure a Fargate task definition that runs application containers alongside a Metricbeat container, using AWS CloudFormation.

Store Elastic Cloud credentials in AWS Secrets Manager

If you use Elastic Cloud, it is recommended to store the cloud ID and cloud auth in AWS Secrets Manager. The following AWS CLI commands are examples:

Create the secret ELASTIC_CLOUD_AUTH:

aws --region us-east-1 secretsmanager create-secret --name ELASTIC_CLOUD_AUTH --secret-string XXX

Create the secret ELASTIC_CLOUD_ID:

aws --region us-east-1 secretsmanager create-secret --name ELASTIC_CLOUD_ID --secret-string YYYY
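The CloudFormation template in this section takes the ARNs of these two secrets as its CloudIDArn and CloudAuthArn parameters. The create-secret command prints a JSON response that includes an ARN field. The Python sketch below pulls that value out of saved command output. The helper name secret_arn and the sample response values are made up for illustration.

```python
import json


def secret_arn(cli_output):
    """Extract the ARN from the JSON printed by `secretsmanager create-secret`."""
    response = json.loads(cli_output)
    return response["ARN"]


# Illustrative output only; the account ID, suffix, and version ID are placeholders.
sample_output = """
{
    "ARN": "arn:aws:secretsmanager:us-east-1:111122223333:secret:ELASTIC_CLOUD_ID-AbCdEf",
    "Name": "ELASTIC_CLOUD_ID",
    "VersionId": "EXAMPLE1-90ab-cdef-fedc-ba987EXAMPLE"
}
"""

print(secret_arn(sample_output))
```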

AWS CloudFormation template example

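The following AWS CloudFormation template is an example for testing purposes only. Replace the sample containers with your actual application. The template shows how to define a new cluster, how to create a task definition with multiple containers (including Metricbeat), and how to start the service.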

AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  SubnetID:
    Type: String
  CloudIDArn:
    Type: String
  CloudAuthArn:
    Type: String
  ClusterName:
    Type: String
  RoleName:
    Type: String
  TaskName:
    Type: String
  ServiceName:
    Type: String
  LogGroupName:
    Type: String
Resources:
  Cluster:
    Type: AWS::ECS::Cluster
    Properties:
      ClusterName: !Ref ClusterName
      ClusterSettings:
        - Name: containerInsights
          Value: enabled
  LogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Ref LogGroupName
  ExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Ref RoleName
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
      Policies:
        - PolicyName: !Sub 'EcsTaskExecutionRole-${AWS::StackName}'
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - secretsmanager:GetSecretValue
                Resource:
                  - !Ref CloudIDArn
                  - !Ref CloudAuthArn
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: !Ref TaskName
      Cpu: 256
      Memory: 512
      NetworkMode: awsvpc
      ExecutionRoleArn: !Ref ExecutionRole
      ContainerDefinitions:
        - Name: metricbeat-container
          Image: docker.elastic.co/beats/metricbeat:8.0.0-SNAPSHOT
          Secrets:
            - Name: ELASTIC_CLOUD_ID
              ValueFrom: !Ref CloudIDArn
            - Name: ELASTIC_CLOUD_AUTH
              ValueFrom: !Ref CloudAuthArn
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-region: !Ref AWS::Region
              awslogs-group: !Ref LogGroup
              awslogs-stream-prefix: ecs
          EntryPoint:
            - sh
            - -c
          Command:
            - ./metricbeat setup -E cloud.id=$ELASTIC_CLOUD_ID -E cloud.auth=$ELASTIC_CLOUD_AUTH && ./metricbeat modules disable system && ./metricbeat modules enable awsfargate && ./metricbeat -e -E cloud.id=$ELASTIC_CLOUD_ID -E cloud.auth=$ELASTIC_CLOUD_AUTH
        - Name: stress-test
          Image: containerstack/alpine-stress
          Essential: false
          DependsOn:
            - ContainerName: metricbeat-container
              Condition: START
          EntryPoint:
            - sh
            - -c
          Command:
            - stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 6000s
      RequiresCompatibilities:
        - EC2
        - FARGATE
  Service:
    Type: AWS::ECS::Service
    Properties:
      ServiceName: !Ref ServiceName
      Cluster: !Ref Cluster
      TaskDefinition: !Ref TaskDefinition
      DesiredCount: 1
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED
          Subnets:
            - !Ref SubnetID

Create the CloudFormation stack

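When copying the CloudFormation template, make sure the Metricbeat container image is the correct version. After saving the template to a local file named cloudformation.yml, you can create the stack with a single AWS CLI command: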

aws --region us-east-1 cloudformation create-stack --stack-name <your-stack-name> --template-body file://./cloudformation.yml --capabilities CAPABILITY_NAMED_IAM --parameters ParameterKey=SubnetID,ParameterValue=<subnet-id> ParameterKey=CloudAuthArn,ParameterValue=<cloud-auth-arn> ParameterKey=CloudIDArn,ParameterValue=<cloud-id-arn> ParameterKey=ClusterName,ParameterValue=<cluster-name> ParameterKey=RoleName,ParameterValue=<role-name> ParameterKey=TaskName,ParameterValue=<task-name> ParameterKey=ServiceName,ParameterValue=<service-name> ParameterKey=LogGroupName,ParameterValue=<log-group-name>

Make sure to replace <subnet-id> with your own subnet. To find a subnet ID to use, go to Services → VPC → Subnets. You can also add more containers to the TaskDefinition section.
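Because the create-stack command carries eight parameters, it can be easier to assemble the --parameters arguments programmatically. The following Python sketch only builds and prints the argument list and does not call AWS. The function name create_stack_command and the example values are hypothetical. The parameter names are taken from the template's Parameters section.

```python
import shlex

# Parameter names declared in the template's Parameters section.
TEMPLATE_PARAMETERS = [
    "SubnetID", "CloudIDArn", "CloudAuthArn", "ClusterName",
    "RoleName", "TaskName", "ServiceName", "LogGroupName",
]


def create_stack_command(stack_name, values, region="us-east-1"):
    """Build the `aws cloudformation create-stack` argument list."""
    missing = [name for name in TEMPLATE_PARAMETERS if name not in values]
    if missing:
        raise ValueError("missing template parameters: " + ", ".join(missing))
    args = [
        "aws", "--region", region, "cloudformation", "create-stack",
        "--stack-name", stack_name,
        "--template-body", "file://./cloudformation.yml",
        "--capabilities", "CAPABILITY_NAMED_IAM",
        "--parameters",
    ]
    for name in TEMPLATE_PARAMETERS:
        args.append("ParameterKey={},ParameterValue={}".format(name, values[name]))
    return args


# Placeholder values for illustration only.
example_values = {name: "example-" + name.lower() for name in TEMPLATE_PARAMETERS}
command = create_stack_command("example-stack", example_values)
print(" ".join(shlex.quote(arg) for arg in command))
```

Checking for missing parameters before building the command surfaces a forgotten value locally instead of as a failed stack creation.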

Delete the CloudFormation stack

The following AWS CLI command deletes the stack, including the cluster, the task definition, and all containers:

aws --region us-east-1 cloudformation delete-stack --stack-name <your-stack-name>

Dashboard

The task_stats metricset comes with a predefined dashboard. For example:

[Image: metricbeat awsfargate overview]

This is a default metricset. If the host module is unconfigured, this metricset is enabled by default.

Fields

For a description of each field in the metricset, see the exported fields section.

Here is an example document generated by this metricset:

{
    "@timestamp": "2017-10-12T08:05:34.853Z",
    "awsfargate": {
        "task_stats": {
            "cluster_name": "default",
            "cpu": {
                "core": null,
                "kernel": {
                    "norm": {
                        "pct": 0
                    },
                    "pct": 0,
                    "ticks": 1520000000
                },
                "system": {
                    "norm": {
                        "pct": 1
                    },
                    "pct": 2,
                    "ticks": 1420180000000
                },
                "total": {
                    "norm": {
                        "pct": 0.2
                    },
                    "pct": 0.4
                },
                "user": {
                    "norm": {
                        "pct": 0
                    },
                    "pct": 0,
                    "ticks": 490000000
                }
            },
            "diskio": {
                "read": {
                    "bytes": 3452928,
                    "ops": 118,
                    "queued": 0,
                    "rate": 0,
                    "service_time": 0,
                    "wait_time": 0
                },
                "reads": 0,
                "summary": {
                    "bytes": 3452928,
                    "ops": 118,
                    "queued": 0,
                    "rate": 0,
                    "service_time": 0,
                    "wait_time": 0
                },
                "total": 0,
                "write": {
                    "bytes": 0,
                    "ops": 0,
                    "queued": 0,
                    "rate": 0,
                    "service_time": 0,
                    "wait_time": 0
                },
                "writes": 0
            },
            "identifier": "query-metadata/1234",
            "memory": {
                "fail": {
                    "count": 0
                },
                "limit": 0,
                "rss": {
                    "pct": 0.0010557805807105247,
                    "total": 4157440
                },
                "stats": {
                    "active_anon": 4157440,
                    "active_file": 4497408,
                    "cache": 6000640,
                    "dirty": 16384,
                    "hierarchical_memory_limit": 2147483648,
                    "hierarchical_memsw_limit": 9223372036854772000,
                    "inactive_anon": 0,
                    "inactive_file": 1503232,
                    "mapped_file": 2183168,
                    "pgfault": 6668,
                    "pgmajfault": 52,
                    "pgpgin": 5925,
                    "pgpgout": 3445,
                    "rss": 4157440,
                    "rss_huge": 0,
                    "total_active_anon": 4157440,
                    "total_active_file": 4497408,
                    "total_cache": 600064,
                    "total_dirty": 16384,
                    "total_inactive_anon": 0,
                    "total_inactive_file": 4497408,
                    "total_mapped_file": 2183168,
                    "total_pgfault": 6668,
                    "total_pgmajfault": 52,
                    "total_pgpgin": 5925,
                    "total_pgpgout": 3445,
                    "total_rss": 4157440,
                    "total_rss_huge": 0,
                    "total_unevictable": 0,
                    "total_writeback": 0,
                    "unevictable": 0,
                    "writeback": 0
                },
                "usage": {
                    "max": 15294464,
                    "total": 12349440
                }
            },
            "network": {
                "eth0": {
                    "inbound": {
                        "bytes": 137315578,
                        "dropped": 0,
                        "errors": 0,
                        "packets": 94338
                    },
                    "outbound": {
                        "bytes": 1086811,
                        "dropped": 0,
                        "errors": 0,
                        "packets": 25857
                    }
                }
            },
            "task_desired_status": "RUNNING",
            "task_known_status": "ACTIVATING",
            "task_name": "query-metadata-1"
        }
    },
    "cloud": {
        "region": "us-west-2"
    },
    "container": {
        "id": "1234",
        "image": {
            "name": "mreferre/eksutils"
        },
        "labels": {
            "com_amazonaws_ecs_cluster": "arn:aws:ecs:us-west-2:111122223333:cluster/default",
            "com_amazonaws_ecs_container-name": "query-metadata",
            "com_amazonaws_ecs_task-arn": "arn:aws:ecs:us-west-2:111122223333:task/default/febee046097849aba589d4435207c04a",
            "com_amazonaws_ecs_task-definition-family": "query-metadata",
            "com_amazonaws_ecs_task-definition-version": "7"
        },
        "name": "query-metadata"
    },
    "service": {
        "type": "awsfargate"
    }
}
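To show how the nested field layout might be consumed, the following Python sketch reads the per-interface byte counters from a document shaped like the example. The function name network_bytes is hypothetical, and the event below is a trimmed copy of the example that keeps only the network section.

```python
def network_bytes(event):
    """Map each interface name to its (inbound, outbound) byte counters."""
    interfaces = (
        event.get("awsfargate", {}).get("task_stats", {}).get("network", {})
    )
    return {
        name: (stats["inbound"]["bytes"], stats["outbound"]["bytes"])
        for name, stats in interfaces.items()
    }


# Trimmed copy of the example document, network section only.
event = {
    "awsfargate": {
        "task_stats": {
            "network": {
                "eth0": {
                    "inbound": {"bytes": 137315578, "dropped": 0, "errors": 0, "packets": 94338},
                    "outbound": {"bytes": 1086811, "dropped": 0, "errors": 0, "packets": 25857},
                }
            }
        }
    }
}

print(network_bytes(event))
```

Using chained get calls with empty defaults means a document without a network section yields an empty mapping rather than raising a KeyError.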