How to write scripts

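Wherever scripting is supported in the Elasticsearch APIs, the syntax follows the same pattern: you specify the language of your script, provide the script logic (or source), and add parameters that are passed into the script: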

  "script": {
    "lang":   "...",
    "source" | "id": "...",
    "params": { ... }
  }

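lang: Specifies the language the script is written in. Defaults to painless.
source, id: The script itself, which you specify as source for an inline script or as id for a stored script. Use the stored script APIs to create and manage stored scripts.
params: Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.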
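
The three-part pattern above can be captured in a small helper. The following is a minimal sketch in plain Ruby, assuming a hypothetical build_script function that is not part of any Elasticsearch client. It only assembles the hash that would be sent as the script object and checks that exactly one of source or id is given.

```ruby
require 'json'

# Hypothetical helper (not part of any Elasticsearch client): assembles the
# "script" object described above and rejects ambiguous input.
def build_script(source: nil, id: nil, lang: nil, params: {})
  # Exactly one of source (inline script) or id (stored script) must be present.
  raise ArgumentError, 'give exactly one of source or id' if source.nil? == id.nil?

  script = {}
  script[:lang] = lang unless lang.nil? # when omitted, the default language applies
  script[:source] = source unless source.nil?
  script[:id] = id unless id.nil?
  script[:params] = params unless params.empty?
  script
end

inline = build_script(
  source: "doc['my_field'].value * params['multiplier']",
  params: { multiplier: 2 }
)
puts JSON.pretty_generate(script: inline)
```

The resulting hash could be passed as the script value in the body of a client request; the helper itself never contacts a cluster.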

Write your first script

Painless is the default scripting language for Elasticsearch. It is secure, performant, and provides a natural syntax for anyone with a little coding experience.

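A Painless script is structured as one or more statements and can optionally begin with one or more user-defined functions. A script must always contain at least one statement.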

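The Painless execute API lets you test a script with simple user-defined parameters and receive a result. Let's start with a complete script and review its constituent parts.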

First, index a document with a single field so that we have some data to work with:

response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    my_field: 5
  }
)
puts response

PUT my-index-000001/_doc/1
{
  "my_field": 5
}

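We can then construct a script that operates on that field and evaluate the script as part of a query. The following query uses the script_fields parameter of the search API to retrieve a script valuation. There's a lot happening here, but we'll break it down into its components to understand them individually. For now, you only need to understand that this script takes my_field and operates on it.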

response = client.search(
  index: 'my-index-000001',
  body: {
    script_fields: {
      my_doubled_field: {
        script: {
          source: "doc['my_field'].value * params['multiplier']",
          params: {
            multiplier: 2
          }
        }
      }
    }
  }
)
puts response

GET my-index-000001/_search
{
  "script_fields": {
    "my_doubled_field": {
      "script": { 
        "source": "doc['my_field'].value * params['multiplier']", 
        "params": {
          "multiplier": 2
        }
      }
    }
  }
}

	`script` object
	`script` source

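The script is a standard JSON object that defines a script under most APIs in Elasticsearch. This object requires source to define the script itself. The script doesn't specify a language, so it defaults to Painless.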
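
As a rough check of what the query above should return for the indexed document, the arithmetic of the script can be mirrored locally. This sketch uses plain Ruby hashes as stand-ins for the Painless doc and params variables, which is an assumption made for illustration; it does not run Painless.

```ruby
# Stand-ins for the Painless variables (illustration only).
doc    = { 'my_field' => [5] }   # doc values behave like lists; .value reads the first entry
params = { 'multiplier' => 2 }

# Mirrors: doc['my_field'].value * params['multiplier']
my_doubled_field = doc['my_field'].first * params['multiplier']
puts my_doubled_field
```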

Use parameters in your script

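The first time Elasticsearch sees a new script, it compiles the script and stores the compiled version in a cache. Compilation can be a heavy process. Rather than hard-coding values in your script, pass them as named params instead.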

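For example, in the previous script, we could have hard-coded the values and written a script that is seemingly less complex. We could simply retrieve the first value of my_field and multiply it by 2: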

"source": "return doc['my_field'].value * 2"

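Though it works, this solution is inflexible. We have to modify the script source to change the multiplier, and Elasticsearch has to recompile the script every time the multiplier changes.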

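Instead of hard-coding values, use named params to make scripts flexible and to reduce compilation time when the script runs. You can now change the multiplier parameter without Elasticsearch recompiling the script.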

"source": "doc['my_field'].value * params['multiplier']",
"params": {
  "multiplier": 2
}
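
To make the compilation argument concrete, here is a toy model of a cache keyed by script source. The ToyScriptCache class is hypothetical and much simpler than whatever Elasticsearch does internally; it only counts how many distinct sources would need compiling when the multiplier changes five times.

```ruby
require 'set'

# Toy model of a compile cache keyed by script source (an assumption for
# illustration; the real cache in Elasticsearch is more involved).
class ToyScriptCache
  attr_reader :compilations

  def initialize
    @compiled = Set.new
    @compilations = 0
  end

  # "Compiles" a source only the first time it is seen.
  def fetch(source)
    return if @compiled.include?(source)

    @compiled << source
    @compilations += 1
  end
end

# Hard-coded multiplier: every new value produces a new source string.
hard_coded = ToyScriptCache.new
(1..5).each { |m| hard_coded.fetch("doc['my_field'].value * #{m}") }

# Named parameter: the source string never changes, only params would.
parameterized = ToyScriptCache.new
(1..5).each { |_m| parameterized.fetch("doc['my_field'].value * params['multiplier']") }

puts "hard-coded: #{hard_coded.compilations} compilations"
puts "parameterized: #{parameterized.compilations} compilation"
```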

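By default, you can compile up to 150 scripts every 5 minutes. For ingest contexts, the default script compilation rate is unlimited. The limit for a context is expressed as a number of compilations per time window; for example, the following setting allows 100 compilations every 10 minutes in the field context: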

script.context.field.max_compilations_rate=100/10m

If you compile too many unique scripts within a short time, Elasticsearch rejects the new dynamic scripts with a circuit_breaking_exception error.
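
The rate values above follow a count/window shape. As a small sketch of how such a string breaks down, the hypothetical parse_rate function below splits it into a count and a window in seconds. It handles only the s, m, and h units and is not how Elasticsearch itself parses or enforces the setting.

```ruby
# Hypothetical parser for rate strings of the form "<count>/<number><unit>",
# such as "150/5m". Only the s, m, and h units are handled in this sketch.
UNIT_SECONDS = { 's' => 1, 'm' => 60, 'h' => 3600 }.freeze

def parse_rate(rate)
  match = %r{\A(\d+)/(\d+)([smh])\z}.match(rate)
  raise ArgumentError, "unrecognized rate: #{rate}" if match.nil?

  { count: match[1].to_i, window_seconds: match[2].to_i * UNIT_SECONDS[match[3]] }
end

default_rate = parse_rate('150/5m')   # the default mentioned above
field_rate   = parse_rate('100/10m')  # the field-context example above
puts default_rate
puts field_rate
```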

Shorten your script

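Using syntactic abilities that are native to Painless, you can reduce verbosity in your scripts and make them shorter. Here's a simple script that we can make shorter: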

response = client.search(
  index: 'my-index-000001',
  body: {
    script_fields: {
      my_doubled_field: {
        script: {
          lang: 'painless',
          source: "doc['my_field'].value * params.get('multiplier');",
          params: {
            multiplier: 2
          }
        }
      }
    }
  }
)
puts response

GET my-index-000001/_search
{
  "script_fields": {
    "my_doubled_field": {
      "script": {
        "lang":   "painless",
        "source": "doc['my_field'].value * params.get('multiplier');",
        "params": {
          "multiplier": 2
        }
      }
    }
  }
}

Let's look at a shortened version of the script to see what improvements it includes over the previous iteration:

GET my-index-000001/_search
{
  "script_fields": {
    "my_doubled_field": {
      "script": {
        "source": "field('my_field').get(null) * params['multiplier']",
        "params": {
          "multiplier": 2
        }
      }
    }
  }
}

This version of the script removes several components and simplifies the syntax significantly:

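The lang declaration. Because Painless is the default language, you don't need to specify the language if you're writing a Painless script.
The return keyword. Painless automatically uses the final statement in a script (when possible) to produce a return value in a script context that requires one.
The get method, which is replaced with brackets []. Painless uses a shortcut specifically for the Map type that allows us to use brackets instead of the lengthier get method.
The semicolon at the end of the source statement. Painless does not require a semicolon after the final statement of a block. However, it does require them in other cases to remove ambiguity.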

Use this abbreviated syntax anywhere that Elasticsearch supports scripts, such as when you're creating runtime fields.

Store and retrieve scripts

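You can store and retrieve scripts from the cluster state using the stored script APIs. Stored scripts reduce compilation time and make searches faster.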

Unlike regular scripts, stored scripts require that you specify a script language using the lang parameter.

To create a script, use the create stored script API. For example, the following request creates a stored script named calculate-score.

response = client.put_script(
  id: 'calculate-score',
  body: {
    script: {
      lang: 'painless',
      source: "Math.log(_score * 2) + params['my_modifier']"
    }
  }
)
puts response

POST _scripts/calculate-score
{
  "script": {
    "lang": "painless",
    "source": "Math.log(_score * 2) + params['my_modifier']"
  }
}

You can retrieve that script by using the get stored script API.

response = client.get_script(
  id: 'calculate-score'
)
puts response

GET _scripts/calculate-score

To use the stored script in a query, include the script id in the script declaration:

response = client.search(
  index: 'my-index-000001',
  body: {
    query: {
      script_score: {
        query: {
          match: {
            message: 'some message'
          }
        },
        script: {
          id: 'calculate-score',
          params: {
            my_modifier: 2
          }
        }
      }
    }
  }
)
puts response

GET my-index-000001/_search
{
  "query": {
    "script_score": {
      "query": {
        "match": {
            "message": "some message"
        }
      },
      "script": {
        "id": "calculate-score", 
        "params": {
          "my_modifier": 2
        }
      }
    }
  }
}

	id of the stored script
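
To get a feel for what calculate-score produces, its formula can be mirrored locally. The sketch below is plain Ruby, not Painless, and the input score of 1.0 is an arbitrary value chosen for illustration; real _score values depend on the query and the index.

```ruby
# Mirrors "Math.log(_score * 2) + params['my_modifier']" for one sample score.
# Math.log is the natural logarithm here, as in the Java Math class.
def calculate_score(score, my_modifier)
  Math.log(score * 2) + my_modifier
end

# 1.0 is an arbitrary sample score; 2 matches my_modifier in the query above.
sample = calculate_score(1.0, 2)
puts sample.round(3)
```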

To delete a stored script, submit a delete stored script API request.

response = client.delete_script(
  id: 'calculate-score'
)
puts response

DELETE _scripts/calculate-score

Update documents with scripts

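You can use the update API to update documents with a specified script. The script can update, delete, or skip modifying the document. The update API also supports passing a partial document, which is merged into the existing document.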

First, let's index a simple document:

response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    counter: 1,
    tags: [
      'red'
    ]
  }
)
puts response

PUT my-index-000001/_doc/1
{
  "counter" : 1,
  "tags" : ["red"]
}

To increment the counter, you can submit an update request with the following script:

response = client.update(
  index: 'my-index-000001',
  id: 1,
  body: {
    script: {
      source: 'ctx._source.counter += params.count',
      lang: 'painless',
      params: {
        count: 4
      }
    }
  }
)
puts response

POST my-index-000001/_update/1
{
  "script" : {
    "source": "ctx._source.counter += params.count",
    "lang": "painless",
    "params" : {
      "count" : 4
    }
  }
}

Similarly, you can use an update script to add a tag to the list of tags. Because this is just a list, the tag is added even if it already exists:

response = client.update(
  index: 'my-index-000001',
  id: 1,
  body: {
    script: {
      source: "ctx._source.tags.add(params['tag'])",
      lang: 'painless',
      params: {
        tag: 'blue'
      }
    }
  }
)
puts response

POST my-index-000001/_update/1
{
  "script": {
    "source": "ctx._source.tags.add(params['tag'])",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
}

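You can also remove a tag from the list of tags. The remove method of a Java List is available in Painless. It takes the index of the element you want to remove. To avoid a possible runtime error, you first need to make sure the tag exists. If the list contains duplicates of the tag, this script removes only one occurrence.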

response = client.update(
  index: 'my-index-000001',
  id: 1,
  body: {
    script: {
      source: "if (ctx._source.tags.contains(params['tag'])) { ctx._source.tags.remove(ctx._source.tags.indexOf(params['tag'])) }",
      lang: 'painless',
      params: {
        tag: 'blue'
      }
    }
  }
)
puts response

POST my-index-000001/_update/1
{
  "script": {
    "source": "if (ctx._source.tags.contains(params['tag'])) { ctx._source.tags.remove(ctx._source.tags.indexOf(params['tag'])) }",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
}
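
The list behavior described above (duplicates are kept on add, and only one occurrence goes away on remove) can be reproduced with a plain Ruby array standing in for ctx._source.tags. The add_tag and remove_tag helpers are hypothetical names used for illustration; this is a local sketch, not a call to Elasticsearch.

```ruby
# Plain Ruby arrays standing in for ctx._source.tags (illustration only).
def add_tag(tags, tag)
  tags << tag # like List.add: appends even if the tag is already present
end

def remove_tag(tags, tag)
  index = tags.index(tag)        # like List.indexOf, but nil when the tag is absent
  tags.delete_at(index) if index # like List.remove(int): removes a single element
  tags
end

tags = ['red']
add_tag(tags, 'blue')
add_tag(tags, 'blue')      # duplicate is kept
remove_tag(tags, 'blue')   # removes only the first 'blue'
remove_tag(tags, 'green')  # absent tag: nothing happens, no error
p tags
```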

You can also add and remove fields from a document. For example, this script adds the field new_field:

response = client.update(
  index: 'my-index-000001',
  id: 1,
  body: {
    script: "ctx._source.new_field = 'value_of_new_field'"
  }
)
puts response

POST my-index-000001/_update/1
{
  "script" : "ctx._source.new_field = 'value_of_new_field'"
}

Conversely, this script removes the field new_field:

response = client.update(
  index: 'my-index-000001',
  id: 1,
  body: {
    script: "ctx._source.remove('new_field')"
  }
)
puts response

POST my-index-000001/_update/1
{
  "script" : "ctx._source.remove('new_field')"
}

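Instead of updating the document, you can also change the operation that is executed from within the script. For example, this request deletes the document if the tags field contains green. Otherwise, it does nothing (noop):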

response = client.update(
  index: 'my-index-000001',
  id: 1,
  body: {
    script: {
      source: "if (ctx._source.tags.contains(params['tag'])) { ctx.op = 'delete' } else { ctx.op = 'none' }",
      lang: 'painless',
      params: {
        tag: 'green'
      }
    }
  }
)
puts response

POST my-index-000001/_update/1
{
  "script": {
    "source": "if (ctx._source.tags.contains(params['tag'])) { ctx.op = 'delete' } else { ctx.op = 'none' }",
    "lang": "painless",
    "params": {
      "tag": "green"
    }
  }
}
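
The branch in that script can be traced locally with a hash standing in for the update context. The choose_op function and the shape of the ctx hash are assumptions made for illustration; only the delete and none operation names come from the script above.

```ruby
# Hypothetical stand-in for the update context: a hash with :op and :source.
def choose_op(ctx, params)
  # Mirrors: if tags contains params['tag'], delete the document; otherwise noop.
  ctx[:op] = ctx[:source]['tags'].include?(params['tag']) ? 'delete' : 'none'
  ctx
end

with_green    = choose_op({ source: { 'tags' => %w[red green] } }, { 'tag' => 'green' })
without_green = choose_op({ source: { 'tags' => %w[red] } }, { 'tag' => 'green' })
puts with_green[:op]
puts without_green[:op]
```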
