Set up a time series data stream (TSDS)

To set up a time series data stream (TSDS), follow the steps below.

Prerequisites

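  • Before you create a TSDS, you should be familiar with data streams and TSDS concepts.
  • To follow this tutorial, you must have the following permissions:

    • Cluster privileges: manage_ilm and manage_index_templates.
    • Index privileges: create_doc and create_index for any TSDS you create or convert. To roll over a TSDS, you must also have the manage privilege.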
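
As a rough illustration of the privileges listed above, the following sketch builds a role body in the shape accepted by the create or update roles API (`cluster`, plus `indices` entries with `names` and `privileges`). The helper name `build_tsds_role` and the `metrics-weather_sensors-*` pattern are assumptions made for this example only; adapt them to your own naming.

```python
import json


def build_tsds_role(data_stream_pattern, allow_rollover=False):
    """Build a role body with the privileges this tutorial needs.

    `build_tsds_role` is a hypothetical helper, not part of any client library.
    """
    index_privileges = ["create_doc", "create_index"]
    if allow_rollover:
        # Rolling over a TSDS additionally requires the manage privilege.
        index_privileges.append("manage")
    return {
        "cluster": ["manage_ilm", "manage_index_templates"],
        "indices": [
            {"names": [data_stream_pattern], "privileges": index_privileges}
        ],
    }


role = build_tsds_role("metrics-weather_sensors-*", allow_rollover=True)
print(json.dumps(role, indent=2))
```

The resulting dictionary could be sent as the body of a `PUT _security/role/<role-name>` request, where the role name is yours to choose.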

Create an index lifecycle policy

Although optional, we recommend using ILM to automate the management of your TSDS's backing indices. ILM requires an index lifecycle policy.

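We recommend that you specify a max_age condition for the rollover action in the policy. This keeps the @timestamp ranges of the TSDS's backing indices consistent. For example, setting a max_age of 1d for the rollover action ensures that each backing index covers one day of data.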

resp = client.ilm.put_lifecycle(
    name="my-weather-sensor-lifecycle-policy",
    policy={
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {
                        "max_age": "1d",
                        "max_primary_shard_size": "50gb"
                    }
                }
            },
            "warm": {
                "min_age": "30d",
                "actions": {
                    "shrink": {
                        "number_of_shards": 1
                    },
                    "forcemerge": {
                        "max_num_segments": 1
                    }
                }
            },
            "cold": {
                "min_age": "60d",
                "actions": {
                    "searchable_snapshot": {
                        "snapshot_repository": "found-snapshots"
                    }
                }
            },
            "frozen": {
                "min_age": "90d",
                "actions": {
                    "searchable_snapshot": {
                        "snapshot_repository": "found-snapshots"
                    }
                }
            },
            "delete": {
                "min_age": "735d",
                "actions": {
                    "delete": {}
                }
            }
        }
    },
)
print(resp)
const response = await client.ilm.putLifecycle({
  name: "my-weather-sensor-lifecycle-policy",
  policy: {
    phases: {
      hot: {
        actions: {
          rollover: {
            max_age: "1d",
            max_primary_shard_size: "50gb",
          },
        },
      },
      warm: {
        min_age: "30d",
        actions: {
          shrink: {
            number_of_shards: 1,
          },
          forcemerge: {
            max_num_segments: 1,
          },
        },
      },
      cold: {
        min_age: "60d",
        actions: {
          searchable_snapshot: {
            snapshot_repository: "found-snapshots",
          },
        },
      },
      frozen: {
        min_age: "90d",
        actions: {
          searchable_snapshot: {
            snapshot_repository: "found-snapshots",
          },
        },
      },
      delete: {
        min_age: "735d",
        actions: {
          delete: {},
        },
      },
    },
  },
});
console.log(response);
PUT _ilm/policy/my-weather-sensor-lifecycle-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "1d",
            "max_primary_shard_size": "50gb"
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "shrink": {
            "number_of_shards": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "cold": {
        "min_age": "60d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "found-snapshots"
          }
        }
      },
      "frozen": {
        "min_age": "90d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "found-snapshots"
          }
        }
      },
      "delete": {
        "min_age": "735d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
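
Before sending a policy, it can be handy to confirm that the hot phase really carries the recommended max_age rollover condition. The following is a minimal sketch that inspects a policy dictionary shaped like the request body above; the helper name `rollover_max_age` is an assumption for this example, and it does not contact a cluster.

```python
def rollover_max_age(policy):
    """Return the hot-phase rollover max_age, or None if it is not set.

    `policy` is the value of the "policy" key in the request body.
    """
    hot_phase = policy.get("phases", {}).get("hot", {})
    rollover = hot_phase.get("actions", {}).get("rollover", {})
    return rollover.get("max_age")


policy = {
    "phases": {
        "hot": {
            "actions": {
                "rollover": {"max_age": "1d", "max_primary_shard_size": "50gb"}
            }
        }
    }
}

if rollover_max_age(policy) is None:
    print("Warning: no max_age rollover condition in the hot phase")
else:
    print("Rollover max_age:", rollover_max_age(policy))
```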

Create an index template

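To set up a TSDS, create an index template with the following details:

  • One or more index patterns that match the TSDS's name. We recommend using our data stream naming scheme.
  • Data streams enabled.
  • Mappings that define your dimensions and metrics:

    • One or more dimension fields with a time_series_dimension value of true. Alternatively, one or more pass-through fields configured as dimension containers, provided that they will contain at least one sub-field (mapped statically or dynamically).
    • One or more metric fields, marked using the time_series_metric mapping parameter.
    • Optional: A date or date_nanos mapping for the @timestamp field. If you don't specify a mapping, Elasticsearch maps @timestamp as a date field with default options.
  • Index settings:

    • The index.mode setting set to time_series.
    • Your lifecycle policy in the index.lifecycle.name index setting.
    • Optional: Other index settings, such as index.number_of_replicas, for your TSDS's backing indices.
  • A priority higher than 200 to avoid collisions with built-in templates. See Avoid index pattern collisions.
  • Optional: Component templates containing your mappings and other index settings.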
resp = client.indices.put_index_template(
    name="my-weather-sensor-index-template",
    index_patterns=[
        "metrics-weather_sensors-*"
    ],
    data_stream={},
    template={
        "settings": {
            "index.mode": "time_series",
            "index.lifecycle.name": "my-weather-sensor-lifecycle-policy"
        },
        "mappings": {
            "properties": {
                "sensor_id": {
                    "type": "keyword",
                    "time_series_dimension": True
                },
                "location": {
                    "type": "keyword",
                    "time_series_dimension": True
                },
                "temperature": {
                    "type": "half_float",
                    "time_series_metric": "gauge"
                },
                "humidity": {
                    "type": "half_float",
                    "time_series_metric": "gauge"
                },
                "@timestamp": {
                    "type": "date"
                }
            }
        }
    },
    priority=500,
    meta={
        "description": "Template for my weather sensor data"
    },
)
print(resp)
response = client.indices.put_index_template(
  name: 'my-weather-sensor-index-template',
  body: {
    index_patterns: [
      'metrics-weather_sensors-*'
    ],
    data_stream: {},
    template: {
      settings: {
        'index.mode' => 'time_series',
        'index.lifecycle.name' => 'my-weather-sensor-lifecycle-policy'
      },
      mappings: {
        properties: {
          sensor_id: {
            type: 'keyword',
            time_series_dimension: true
          },
          location: {
            type: 'keyword',
            time_series_dimension: true
          },
          temperature: {
            type: 'half_float',
            time_series_metric: 'gauge'
          },
          humidity: {
            type: 'half_float',
            time_series_metric: 'gauge'
          },
          "@timestamp": {
            type: 'date'
          }
        }
      }
    },
    priority: 500,
    _meta: {
      description: 'Template for my weather sensor data'
    }
  }
)
puts response
const response = await client.indices.putIndexTemplate({
  name: "my-weather-sensor-index-template",
  index_patterns: ["metrics-weather_sensors-*"],
  data_stream: {},
  template: {
    settings: {
      "index.mode": "time_series",
      "index.lifecycle.name": "my-weather-sensor-lifecycle-policy",
    },
    mappings: {
      properties: {
        sensor_id: {
          type: "keyword",
          time_series_dimension: true,
        },
        location: {
          type: "keyword",
          time_series_dimension: true,
        },
        temperature: {
          type: "half_float",
          time_series_metric: "gauge",
        },
        humidity: {
          type: "half_float",
          time_series_metric: "gauge",
        },
        "@timestamp": {
          type: "date",
        },
      },
    },
  },
  priority: 500,
  _meta: {
    description: "Template for my weather sensor data",
  },
});
console.log(response);
PUT _index_template/my-weather-sensor-index-template
{
  "index_patterns": ["metrics-weather_sensors-*"],
  "data_stream": { },
  "template": {
    "settings": {
      "index.mode": "time_series",
      "index.lifecycle.name": "my-weather-sensor-lifecycle-policy"
    },
    "mappings": {
      "properties": {
        "sensor_id": {
          "type": "keyword",
          "time_series_dimension": true
        },
        "location": {
          "type": "keyword",
          "time_series_dimension": true
        },
        "temperature": {
          "type": "half_float",
          "time_series_metric": "gauge"
        },
        "humidity": {
          "type": "half_float",
          "time_series_metric": "gauge"
        },
        "@timestamp": {
          "type": "date"
        }
      }
    }
  },
  "priority": 500,
  "_meta": {
    "description": "Template for my weather sensor data"
  }
}
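
The requirements listed for the index template can be checked locally before the template is sent. The sketch below is a simplified checker, assuming settings are written with flat dotted keys (as in the examples above) and that all mappings live in the index template itself; it does not handle component templates or pass-through dimension containers. The name `check_tsds_template` is hypothetical.

```python
def check_tsds_template(body):
    """Return a list of problems found in an index template body (empty if none)."""
    problems = []
    if not body.get("index_patterns"):
        problems.append("no index patterns")
    if "data_stream" not in body:
        problems.append("data streams not enabled")

    template = body.get("template", {})
    settings = template.get("settings", {})
    if settings.get("index.mode") != "time_series":
        problems.append("index.mode is not time_series")

    fields = template.get("mappings", {}).get("properties", {}).values()
    if not any(f.get("time_series_dimension") is True for f in fields):
        problems.append("no dimension field")
    if not any("time_series_metric" in f for f in fields):
        problems.append("no metric field")

    # Priorities of 200 and below can collide with built-in templates.
    if body.get("priority", 0) <= 200:
        problems.append("priority is not higher than 200")
    return problems


body = {
    "index_patterns": ["metrics-weather_sensors-*"],
    "data_stream": {},
    "template": {
        "settings": {"index.mode": "time_series"},
        "mappings": {
            "properties": {
                "sensor_id": {"type": "keyword", "time_series_dimension": True},
                "temperature": {"type": "half_float", "time_series_metric": "gauge"},
            }
        },
    },
    "priority": 500,
}
print(check_tsds_template(body) or "template looks complete")
```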

Create the TSDS

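Indexing requests add documents to a TSDS. Documents in a TSDS must include:

  • A @timestamp field.
  • One or more dimension fields. At least one dimension must match the index.routing_path index setting, if specified. If not specified explicitly, index.routing_path is set automatically to whichever mappings have time_series_dimension set to true.

To create your TSDS automatically, submit an indexing request that targets the TSDS's name. This name must match one of your index template's index patterns.

To test the following example, update the timestamps to within three hours of your current time. Data added to a TSDS must always fall within an accepted time range.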

resp = client.bulk(
    index="metrics-weather_sensors-dev",
    operations=[
        {
            "create": {}
        },
        {
            "@timestamp": "2099-05-06T16:21:15.000Z",
            "sensor_id": "HAL-000001",
            "location": "plains",
            "temperature": 26.7,
            "humidity": 49.9
        },
        {
            "create": {}
        },
        {
            "@timestamp": "2099-05-06T16:25:42.000Z",
            "sensor_id": "SYKENET-000001",
            "location": "swamp",
            "temperature": 32.4,
            "humidity": 88.9
        }
    ],
)
print(resp)

resp1 = client.index(
    index="metrics-weather_sensors-dev",
    document={
        "@timestamp": "2099-05-06T16:21:15.000Z",
        "sensor_id": "SYKENET-000001",
        "location": "swamp",
        "temperature": 32.4,
        "humidity": 88.9
    },
)
print(resp1)
const response = await client.bulk({
  index: "metrics-weather_sensors-dev",
  operations: [
    {
      create: {},
    },
    {
      "@timestamp": "2099-05-06T16:21:15.000Z",
      sensor_id: "HAL-000001",
      location: "plains",
      temperature: 26.7,
      humidity: 49.9,
    },
    {
      create: {},
    },
    {
      "@timestamp": "2099-05-06T16:25:42.000Z",
      sensor_id: "SYKENET-000001",
      location: "swamp",
      temperature: 32.4,
      humidity: 88.9,
    },
  ],
});
console.log(response);

const response1 = await client.index({
  index: "metrics-weather_sensors-dev",
  document: {
    "@timestamp": "2099-05-06T16:21:15.000Z",
    sensor_id: "SYKENET-000001",
    location: "swamp",
    temperature: 32.4,
    humidity: 88.9,
  },
});
console.log(response1);
PUT metrics-weather_sensors-dev/_bulk
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:21:15.000Z", "sensor_id": "HAL-000001", "location": "plains", "temperature": 26.7,"humidity": 49.9 }
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:25:42.000Z", "sensor_id": "SYKENET-000001", "location": "swamp", "temperature": 32.4, "humidity": 88.9 }

POST metrics-weather_sensors-dev/_doc
{
  "@timestamp": "2099-05-06T16:21:15.000Z",
  "sensor_id": "SYKENET-000001",
  "location": "swamp",
  "temperature": 32.4,
  "humidity": 88.9
}
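
Because the sample timestamps above need to be replaced with values close to the current time, it may help to generate the documents programmatically. The following sketch builds bulk operations with current UTC timestamps in the same format as the examples; the helpers `iso_millis`, `make_reading` and `bulk_operations` are assumptions for illustration and only produce data structures, without sending anything to a cluster.

```python
from datetime import datetime, timezone


def iso_millis(moment):
    """Format a timezone-aware datetime as e.g. 2099-05-06T16:21:15.000Z."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def make_reading(sensor_id, location, temperature, humidity, moment=None):
    """Build one document with a @timestamp, two dimensions and two metrics."""
    moment = moment or datetime.now(timezone.utc)
    return {
        "@timestamp": iso_millis(moment),
        "sensor_id": sensor_id,
        "location": location,
        "temperature": temperature,
        "humidity": humidity,
    }


def bulk_operations(readings):
    """Interleave a create action before each document, as the bulk API expects."""
    operations = []
    for reading in readings:
        operations.append({"create": {}})
        operations.append(reading)
    return operations


operations = bulk_operations([
    make_reading("HAL-000001", "plains", 26.7, 49.9),
    make_reading("SYKENET-000001", "swamp", 32.4, 88.9),
])
print(operations)
```

The `operations` list has the same shape as the `operations` argument in the bulk example above, so it could be passed to that call in place of the hard-coded documents.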

You can also create the TSDS manually using the create data stream API. The TSDS's name must still match one of your template's index patterns.

resp = client.indices.create_data_stream(
    name="metrics-weather_sensors-dev",
)
print(resp)
response = client.indices.create_data_stream(
  name: 'metrics-weather_sensors-dev'
)
puts response
const response = await client.indices.createDataStream({
  name: "metrics-weather_sensors-dev",
});
console.log(response);
PUT _data_stream/metrics-weather_sensors-dev
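
Since the data stream name has to match one of the template's index patterns, a quick local check can catch naming mistakes early. This is a minimal sketch that treats `*` as the only wildcard; it says nothing about which template Elasticsearch would actually select when several match (that also depends on priority). The function name `matches_index_pattern` is an assumption for this example.

```python
import re


def matches_index_pattern(name, pattern):
    """Return True if `name` matches `pattern`, where `*` matches any characters."""
    # Escape the literal parts and join them with a regex wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None


patterns = ["metrics-weather_sensors-*"]
name = "metrics-weather_sensors-dev"
print(any(matches_index_pattern(name, p) for p in patterns))
```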

Secure the TSDS

Use index privileges to control access to a TSDS. Granting privileges on a TSDS grants the same privileges on its backing indices.

For an example, see Data stream privileges.

Convert an existing data stream to a TSDS

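You can also use the steps above to convert an existing regular data stream to a TSDS. In this case, you need to:

  • Edit your existing index lifecycle policy, component templates, and index templates instead of creating new ones.
  • Manually roll over its write index instead of creating the TSDS. This ensures that the current write index and any new backing indices have an index.mode of time_series.

    You can manually roll over the write index using the rollover API.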

    resp = client.indices.rollover(
        alias="metrics-weather_sensors-dev",
    )
    print(resp)
    response = client.indices.rollover(
      alias: 'metrics-weather_sensors-dev'
    )
    puts response
    const response = await client.indices.rollover({
      alias: "metrics-weather_sensors-dev",
    });
    console.log(response);
    POST metrics-weather_sensors-dev/_rollover

A note about component templates and the index.mode setting

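Configuring a TSDS through an index template that uses component templates is a bit more complicated. With component templates, mappings and settings are typically scattered across several component templates. If index.routing_path is defined, the fields it references must be defined in the same component template, with the time_series_dimension attribute enabled.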

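The reason is that each component template needs to be valid on its own. When the index.mode setting is configured in an index template, the index.routing_path setting is configured automatically. It is derived from the field mappings that have the time_series_dimension attribute enabled.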
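
To make the constraint concrete, the sketch below checks a single component template body for index.routing_path entries that are not defined as dimensions in that same template. It is a simplified illustration, assuming flat dotted setting keys, top-level fields only, and routing paths without wildcards; the helper name `undefined_routing_fields` is hypothetical.

```python
def undefined_routing_fields(component_template):
    """Return routing_path entries not mapped as dimensions in this template."""
    template = component_template.get("template", {})
    routing_path = template.get("settings", {}).get("index.routing_path", [])
    if isinstance(routing_path, str):
        # A single path may be given as a plain string.
        routing_path = [routing_path]

    properties = template.get("mappings", {}).get("properties", {})
    dimensions = {
        name
        for name, field in properties.items()
        if field.get("time_series_dimension") is True
    }
    return [path for path in routing_path if path not in dimensions]


component = {
    "template": {
        "settings": {"index.routing_path": ["sensor_id", "location"]},
        "mappings": {
            "properties": {
                "sensor_id": {"type": "keyword", "time_series_dimension": True}
            }
        },
    }
}
# "location" is referenced by routing_path but not mapped here as a dimension.
print(undefined_routing_fields(component))
```

An empty result suggests the component template is self-contained with respect to index.routing_path; any names returned would need a dimension mapping added to the same component template.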

What's next?

Now that you have set up your TSDS, you can manage and use it like a regular data stream. For more information, see: