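max_chunking_size: (Optional, integer) Specifies the maximum size of a chunk, in words. Defaults to 250. This value cannot be higher than 300 or lower than 20 (for the sentence strategy) or 10 (for the word strategy).
overlap: (Optional, integer) Only for the word chunking strategy. Specifies the number of overlapping words between chunks. Defaults to 100. This value cannot be higher than half of max_chunking_size.
sentence_overlap: (Optional, integer) Only for the sentence chunking strategy. Specifies the number of overlapping sentences between chunks. It can be either 1 or 0. Defaults to 1.
strategy: (Optional, string) Specifies the chunking strategy. It can be either sentence or word.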
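
The limits listed above can be checked on the client side before a request is sent. The following is a minimal sketch only: the helper name check_chunking_settings is hypothetical and not part of any Elasticsearch client, and the server performs its own validation regardless.

```python
def check_chunking_settings(strategy, max_chunking_size=250, overlap=100, sentence_overlap=1):
    """Return a list of problems; an empty list means the values fit the documented limits.

    Hypothetical helper for illustration. Defaults mirror the documented defaults.
    """
    problems = []
    if strategy not in ("sentence", "word"):
        problems.append("strategy must be 'sentence' or 'word'")
        return problems

    # The lower bound depends on the strategy: 20 for sentence, 10 for word.
    lower = 20 if strategy == "sentence" else 10
    if not lower <= max_chunking_size <= 300:
        problems.append(f"max_chunking_size must be between {lower} and 300")

    # overlap applies only to the word strategy and is capped at half the chunk size.
    if strategy == "word" and overlap * 2 > max_chunking_size:
        problems.append("overlap must not exceed half of max_chunking_size")

    # sentence_overlap applies only to the sentence strategy.
    if strategy == "sentence" and sentence_overlap not in (0, 1):
        problems.append("sentence_overlap must be 0 or 1")

    return problems


print(check_chunking_settings("word", max_chunking_size=200, overlap=120))
```

With a chunk size of 200, an overlap of 120 exceeds the allowed half, so the call reports one problem.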

service

(Required, string) The type of service supported for the specified task type. In this case, anthropic.

service_settings

(Required, object) Settings used to install the inference model.

These settings are specific to the anthropic service.

api_key

(Required, string) A valid API key for the Anthropic API.

model_id

(Required, string) The name of the model to use for the inference task. You can find the supported models at Anthropic models.

rate_limit

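(Optional, object) By default, the anthropic service sets the number of requests allowed per minute to 50. This helps to minimize the number of rate limit errors returned from Anthropic. To modify this value, set the requests_per_minute setting of this object in your service settings: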

"rate_limit": {
    "requests_per_minute": <<number_of_requests>>
}
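
As a rough illustration of where the rate_limit object sits, the sketch below assembles a service_settings dictionary in Python. The function name anthropic_service_settings is hypothetical, and rejecting non-positive values is an assumption of this sketch rather than documented server behavior.

```python
def anthropic_service_settings(api_key, model_id, requests_per_minute=None):
    """Build a service_settings dict for the anthropic service (hypothetical helper)."""
    settings = {"api_key": api_key, "model_id": model_id}
    if requests_per_minute is not None:
        # Assumption of this sketch: a non-positive limit is treated as a caller error.
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be a positive integer")
        # rate_limit is nested inside service_settings, not at the top level.
        settings["rate_limit"] = {"requests_per_minute": requests_per_minute}
    return settings


print(anthropic_service_settings("<api_key>", "<model_id>", requests_per_minute=30))
```

When requests_per_minute is omitted, the rate_limit object is left out and the service default of 50 applies.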

task_settings

(Required, object) Settings to configure the inference task. These settings are specific to the <task_type> you specified.

task_settings for the completion task type

max_tokens

(Required, integer) The maximum number of tokens to generate before stopping.

temperature

(Optional, float) The amount of randomness injected into the response.

For more details about the supported range, refer to the Anthropic messages API.

top_k

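(Optional, integer) Specifies to sample only from the top K options for each subsequent token.

Recommended for advanced use cases only. You usually only need to use temperature.

For more details, refer to the Anthropic messages API.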

top_p

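(Optional, float) Specifies to use Anthropic's nucleus sampling.

In nucleus sampling, Anthropic computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cuts it off once it reaches the probability specified by top_p. You should alter either temperature or top_p, but not both.

Recommended for advanced use cases only. You usually only need to use temperature.

For more details, refer to the Anthropic messages API.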
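
To make the difference between top_k and top_p concrete, here is a small sketch of the two truncation rules applied to a toy distribution. It illustrates the general technique as described above and is not Anthropic's implementation; the function names and the example probabilities are made up for illustration.

```python
def top_k_candidates(probs, k):
    """Keep the k most probable tokens (probs maps token -> probability)."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return [token for token, _ in ranked[:k]]


def nucleus_candidates(probs, top_p):
    """Keep tokens in decreasing probability order until their cumulative mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break  # cut off once the threshold is reached
    return kept


# Toy next-token distribution (invented values).
probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(top_k_candidates(probs, 3))
print(nucleus_candidates(probs, 0.7))
```

Sampling would then pick the next token from the kept candidates only. top_k fixes the number of candidates, while top_p lets that number vary with how concentrated the distribution is.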

Anthropic service example

The following example shows how to create an inference endpoint called anthropic_completion to perform a completion task type.

resp = client.inference.put(
    task_type="completion",
    inference_id="anthropic_completion",
    inference_config={
        "service": "anthropic",
        "service_settings": {
            "api_key": "<api_key>",
            "model_id": "<model_id>"
        },
        "task_settings": {
            "max_tokens": 1024
        }
    },
)
print(resp)

const response = await client.inference.put({
  task_type: "completion",
  inference_id: "anthropic_completion",
  inference_config: {
    service: "anthropic",
    service_settings: {
      api_key: "<api_key>",
      model_id: "<model_id>",
    },
    task_settings: {
      max_tokens: 1024,
    },
  },
});
console.log(response);

PUT _inference/completion/anthropic_completion
{
    "service": "anthropic",
    "service_settings": {
        "api_key": "<api_key>",
        "model_id": "<model_id>"
    },
    "task_settings": {
        "max_tokens": 1024
    }
}
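
The examples above set only max_tokens. As a hedged sketch of how the optional task settings could be added, the helper below builds a task_settings dictionary and refuses to set temperature and top_p together, following the recommendation to alter only one of them. The helper name completion_task_settings is hypothetical, and raising an error is a choice made in this sketch, not behavior confirmed for the service.

```python
def completion_task_settings(max_tokens, temperature=None, top_k=None, top_p=None):
    """Build task_settings for the completion task type (hypothetical helper)."""
    # Sketch-level policy: follow the guidance to change temperature or top_p, not both.
    if temperature is not None and top_p is not None:
        raise ValueError("set either temperature or top_p, not both")

    settings = {"max_tokens": max_tokens}  # max_tokens is the only required setting
    optional = {"temperature": temperature, "top_k": top_k, "top_p": top_p}
    # Include only the optional settings the caller actually provided.
    settings.update({name: value for name, value in optional.items() if value is not None})
    return settings


inference_config = {
    "service": "anthropic",
    "service_settings": {"api_key": "<api_key>", "model_id": "<model_id>"},
    "task_settings": completion_task_settings(1024, temperature=0.2),
}
print(inference_config)

# The dictionary can then be passed as in the Python example above:
# client.inference.put(task_type="completion",
#                      inference_id="anthropic_completion",
#                      inference_config=inference_config)
```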