Common options

All Elasticsearch REST APIs support the following options.

Pretty results

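When ?pretty=true is appended to any request, the returned JSON is pretty formatted (use it for debugging only!). Another option is to set ?format=yaml, which causes the result to be returned in the (sometimes) more readable YAML format.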
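As a small illustration, the sketch below appends these options to a request path. The helper name `withOptions` is hypothetical and not part of any Elasticsearch client; it only builds the query string.

```javascript
// Hypothetical helper: append common options to a REST path as a query string.
function withOptions(path, options) {
  const query = new URLSearchParams();
  for (const [name, value] of Object.entries(options)) {
    query.set(name, String(value));
  }
  const suffix = query.toString();
  return suffix ? `${path}?${suffix}` : path;
}

console.log(withOptions('/_search', { pretty: true }));
console.log(withOptions('/_search', { format: 'yaml' }));
```

Building the query string with `URLSearchParams` keeps values URL-encoded, which matters once options such as filter expressions contain reserved characters.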

Human readable output

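Statistics are returned in a format suitable for humans (for example "exists_time": "1h" or "size": "1kb") and in a format suitable for computers (for example "exists_time_in_millis": 3600000 or "size_in_bytes": 1024). The human readable values can be turned off by adding ?human=false to the query string. This makes sense when the stats results are consumed by a monitoring tool rather than read by people. The default for the human flag is false.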
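To illustrate how the two representations relate, here is a rough sketch that derives a human readable string from the machine readable value. The names `humanize`, `humanBytes` and `humanMillis` are hypothetical, and the rounding rule (at most one decimal) is an assumption of this sketch; it is only checked against the two sample values above, not against Elasticsearch's own formatter.

```javascript
// Rough approximation only; not Elasticsearch's own formatting code.
function humanize(value, units) {
  // units are ordered from largest to smallest, each entry is [suffix, size]
  for (const [suffix, size] of units) {
    if (value >= size) {
      // Keep at most one decimal and drop a trailing ".0"
      return `${parseFloat((value / size).toFixed(1))}${suffix}`;
    }
  }
  // Values below the smallest unit (for example 0) use the smallest suffix
  return `${value}${units[units.length - 1][0]}`;
}

const BYTE_UNITS = [['gb', 1024 ** 3], ['mb', 1024 ** 2], ['kb', 1024], ['b', 1]];
const TIME_UNITS = [['d', 86400000], ['h', 3600000], ['m', 60000], ['s', 1000], ['ms', 1]];

const humanBytes = (bytes) => humanize(bytes, BYTE_UNITS);
const humanMillis = (millis) => humanize(millis, TIME_UNITS);

console.log(humanBytes(1024));     // 1kb
console.log(humanMillis(3600000)); // 1h
```

A monitoring tool would normally skip this step entirely and read the `_in_bytes` and `_in_millis` fields directly.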

Date math

Most parameters that accept a formatted date value (such as gt and lt in range queries, or from and to in daterange aggregations) understand date math.

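The expression starts with an anchor date, which can be either now or a date string ending with ||. The anchor date can optionally be followed by one or more math expressions:

  • +1h: Add one hour
  • -1d: Subtract one day
  • /d: Round down to the nearest day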

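The supported time units differ from those supported by time units for durations. The supported units are:

y

Years

M

Months

w

Weeks

d

Days

h

Hours

H

Hours

m

Minutes

s

Seconds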

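Assuming now is 2001-01-01 12:00:00, some examples are:

now+1h

now in milliseconds plus one hour. Resolves to: 2001-01-01 13:00:00

now-1h

now in milliseconds minus one hour. Resolves to: 2001-01-01 11:00:00

now-1h/d

now in milliseconds minus one hour, rounded down to UTC 00:00. Resolves to: 2001-01-01 00:00:00

2001.02.01||+1M/d

2001-02-01 in milliseconds plus one month. Resolves to: 2001-03-01 00:00:00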
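The resolution rules above can be sketched as a small evaluator. This is a minimal illustration, not Elasticsearch's parser: the function names are hypothetical, everything is computed in UTC, only `yyyy-MM-dd` or `yyyy.MM.dd` anchors are accepted, rounding to a week is left out, and month arithmetic follows JavaScript's `Date` overflow behavior.

```javascript
// Add n units to a date in place (UTC). h and H both mean hours.
function addUnits(date, unit, n) {
  switch (unit) {
    case 'y': date.setUTCFullYear(date.getUTCFullYear() + n); break;
    case 'M': date.setUTCMonth(date.getUTCMonth() + n); break;
    case 'w': date.setUTCDate(date.getUTCDate() + 7 * n); break;
    case 'd': date.setUTCDate(date.getUTCDate() + n); break;
    case 'h':
    case 'H': date.setUTCHours(date.getUTCHours() + n); break;
    case 'm': date.setUTCMinutes(date.getUTCMinutes() + n); break;
    case 's': date.setUTCSeconds(date.getUTCSeconds() + n); break;
  }
}

// Round a date down to the start of the given unit in place (UTC).
function roundDown(date, unit) {
  switch (unit) {
    case 'y': date.setUTCMonth(0, 1); date.setUTCHours(0, 0, 0, 0); break;
    case 'M': date.setUTCDate(1); date.setUTCHours(0, 0, 0, 0); break;
    case 'd': date.setUTCHours(0, 0, 0, 0); break;
    case 'h':
    case 'H': date.setUTCMinutes(0, 0, 0); break;
    case 'm': date.setUTCSeconds(0, 0); break;
    case 's': date.setUTCMilliseconds(0); break;
    default: throw new Error(`Rounding to ${unit} is not covered by this sketch`);
  }
}

// Resolve an expression such as "now-1h/d" or "2001.02.01||+1M/d".
function resolveDateMath(expression, now) {
  let date;
  let math;
  if (expression.startsWith('now')) {
    date = new Date(now.getTime());
    math = expression.slice(3);
  } else {
    const [anchor, rest = ''] = expression.split('||');
    const [year, month, day] = anchor.split(/[.-]/).map(Number);
    date = new Date(Date.UTC(year, month - 1, day));
    math = rest;
  }
  // Each token is either +N<unit>, -N<unit>, or /<unit>
  const token = /(?:([+-])(\d+)|\/)([yMwdhHms])/y;
  let position = 0;
  while (position < math.length) {
    token.lastIndex = position;
    const match = token.exec(math);
    if (!match) throw new Error(`Cannot parse date math at: ${math.slice(position)}`);
    const [, sign, amount, unit] = match;
    if (sign) addUnits(date, unit, (sign === '-' ? -1 : 1) * Number(amount));
    else roundDown(date, unit);
    position = token.lastIndex;
  }
  return date;
}

const now = new Date('2001-01-01T12:00:00Z');
console.log(resolveDateMath('now-1h/d', now).toISOString());
console.log(resolveDateMath('2001.02.01||+1M/d', now).toISOString());
```

Operations are applied strictly left to right, which is why `now-1h/d` subtracts the hour before rounding down to the start of the day.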

Response filtering

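All REST APIs accept a filter_path parameter that can be used to reduce the response returned by Elasticsearch. This parameter takes a comma-separated list of filters expressed in dot notation: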

response = client.search(
  q: 'kimchy',
  filter_path: 'took,hits.hits._id,hits.hits._score'
)
puts response

GET /_search?q=kimchy&filter_path=took,hits.hits._id,hits.hits._score

Response

{
  "took" : 3,
  "hits" : {
    "hits" : [
      {
        "_id" : "0",
        "_score" : 1.6375021
      }
    ]
  }
}

It also supports the * wildcard character to match any field or part of a field's name:

$response = $client->cluster()->state([
    'filter_path' => 'metadata.indices.*.stat*',
]);

response = client.cluster.state(
  filter_path: 'metadata.indices.*.stat*'
)
puts response

res, err := es.Cluster.State(
	es.Cluster.State.WithFilterPath("metadata.indices.*.stat*"),
)
fmt.Println(res, err)

const response = await client.cluster.state({
  filter_path: 'metadata.indices.*.stat*'
})
console.log(response)

GET /_cluster/state?filter_path=metadata.indices.*.stat*

Response

{
  "metadata" : {
    "indices" : {
      "my-index-000001": {"state": "open"}
    }
  }
}

The ** wildcard can be used to include fields without knowing the exact path of the field. For example, this request returns the state of every shard:

$response = $client->cluster()->state([
    'filter_path' => 'routing_table.indices.**.state',
]);

response = client.cluster.state(
  filter_path: 'routing_table.indices.**.state'
)
puts response

res, err := es.Cluster.State(
	es.Cluster.State.WithFilterPath("routing_table.indices.**.state"),
)
fmt.Println(res, err)

const response = await client.cluster.state({
  filter_path: 'routing_table.indices.**.state'
})
console.log(response)

GET /_cluster/state?filter_path=routing_table.indices.**.state

Response

{
  "routing_table": {
    "indices": {
      "my-index-000001": {
        "shards": {
          "0": [{"state": "STARTED"}, {"state": "UNASSIGNED"}]
        }
      }
    }
  }
}
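To make the wildcard semantics concrete, the following sketch applies inclusive filters to an already parsed JSON body on the client side. It is an illustration only, not how Elasticsearch implements filter_path: the function names are hypothetical, the sample body is made up, and treating ** as matching any number of nested levels is an assumption of this sketch.

```javascript
// Match one path segment against a key; * matches any part of the key.
function matchesKey(segment, key) {
  const escaped = segment.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${escaped.replace(/\*/g, '.*')}$`).test(key);
}

// paths: array of segment arrays that still have to be matched at this level.
function include(value, paths) {
  if (Array.isArray(value)) {
    // Arrays are transparent: filter each element with the same paths.
    const items = value.map((item) => include(item, paths)).filter((item) => item !== undefined);
    return items.length ? items : undefined;
  }
  if (value === null || typeof value !== 'object') return undefined;

  const result = {};
  for (const [key, child] of Object.entries(value)) {
    let keepWhole = false;
    const remaining = [];
    for (const path of paths) {
      const [head, ...tail] = path;
      if (head === '**') {
        if (tail.length === 0) { keepWhole = true; continue; }
        remaining.push(path); // ** may also swallow this level
        if (matchesKey(tail[0], key)) {
          if (tail.length === 1) keepWhole = true;
          else remaining.push(tail.slice(1));
        }
      } else if (matchesKey(head, key)) {
        if (tail.length === 0) keepWhole = true;
        else remaining.push(tail);
      }
    }
    if (keepWhole) {
      result[key] = child;
    } else if (remaining.length) {
      const filtered = include(child, remaining);
      if (filtered !== undefined) result[key] = filtered;
    }
  }
  return Object.keys(result).length ? result : undefined;
}

// Apply a comma-separated list of inclusive filters in dot notation.
function filterPath(body, expression) {
  const paths = expression.split(',').map((filter) => filter.split('.'));
  return include(body, paths) || {};
}

// Made-up sample body shaped like a cluster state response.
const sampleState = {
  cluster_name: 'demo',
  routing_table: {
    indices: {
      'my-index-000001': {
        shards: { 0: [{ state: 'STARTED', primary: true }, { state: 'UNASSIGNED', primary: false }] },
      },
    },
  },
};
console.log(JSON.stringify(filterPath(sampleState, 'routing_table.indices.**.state')));
```

Filtering on the server with filter_path is still preferable in practice, because it reduces the amount of data sent over the network.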

It is also possible to exclude one or more fields by prefixing the filter with the - character:

$response = $client->count([
    'filter_path' => '-_shards',
]);

response = client.count(
  filter_path: '-_shards'
)
puts response

res, err := es.Count(
	es.Count.WithFilterPath("-_shards"),
	es.Count.WithPretty(),
)
fmt.Println(res, err)

const response = await client.count({
  filter_path: '-_shards'
})
console.log(response)

GET /_count?filter_path=-_shards

Response

{
  "count" : 5
}

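For more control, inclusive and exclusive filters can be combined in the same expression. In that case the exclusive filters are applied first, and the result is then filtered again using the inclusive filters: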

$response = $client->cluster()->state([
    'filter_path' => 'metadata.indices.*.state,-metadata.indices.logstash-*',
]);

response = client.cluster.state(
  filter_path: 'metadata.indices.*.state,-metadata.indices.logstash-*'
)
puts response

res, err := es.Cluster.State(
	es.Cluster.State.WithFilterPath("metadata.indices.*.state,-metadata.indices.logstash-*"),
)
fmt.Println(res, err)

const response = await client.cluster.state({
  filter_path: 'metadata.indices.*.state,-metadata.indices.logstash-*'
})
console.log(response)

GET /_cluster/state?filter_path=metadata.indices.*.state,-metadata.indices.logstash-*

Response

{
  "metadata" : {
    "indices" : {
      "my-index-000001" : {"state" : "open"},
      "my-index-000002" : {"state" : "open"},
      "my-index-000003" : {"state" : "open"}
    }
  }
}
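The ordering described above (exclusive filters first, then inclusive ones) can be sketched on the client side as follows. This is a simplified illustration under stated assumptions, not Elasticsearch's implementation: all function names are hypothetical, the body is flattened into leaf paths, arrays are treated as leaf values, and keys are assumed not to contain dots.

```javascript
// Turn one filter into a RegExp over dot-joined leaf paths.
// * matches within one key, ** matches across levels.
function toRegExp(filter) {
  const source = filter
    .split('.')
    .map((segment) => (segment === '**'
      ? '.*'
      : segment.replace(/[+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '[^.]*')))
    .join('\\.');
  // A filter also selects everything nested below the matched field.
  return new RegExp(`^${source}(\\.|$)`);
}

// Collect [pathSegments, leafValue] pairs from nested plain objects.
function flatten(value, prefix = [], out = []) {
  if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
    for (const [key, child] of Object.entries(value)) flatten(child, [...prefix, key], out);
  } else {
    out.push([prefix, value]);
  }
  return out;
}

// Rebuild a nested object from [pathSegments, leafValue] pairs.
function unflatten(entries) {
  const root = {};
  for (const [path, value] of entries) {
    let node = root;
    for (const key of path.slice(0, -1)) {
      if (!(key in node)) node[key] = {};
      node = node[key];
    }
    node[path[path.length - 1]] = value;
  }
  return root;
}

function applyFilterPath(body, expression) {
  const filters = expression.split(',');
  const excludes = filters.filter((f) => f.startsWith('-')).map((f) => toRegExp(f.slice(1)));
  const includes = filters.filter((f) => !f.startsWith('-')).map(toRegExp);

  let entries = flatten(body);
  // Step 1: drop everything matched by an exclusive filter.
  entries = entries.filter(([path]) => !excludes.some((re) => re.test(path.join('.'))));
  // Step 2: of what is left, keep only what an inclusive filter matches.
  if (includes.length) {
    entries = entries.filter(([path]) => includes.some((re) => re.test(path.join('.'))));
  }
  return unflatten(entries);
}

// Made-up sample body shaped like a cluster state response.
const sample = {
  metadata: {
    indices: {
      'my-index-000001': { state: 'open', version: 4 },
      'logstash-2001.01.01': { state: 'open', version: 7 },
    },
  },
};
console.log(JSON.stringify(
  applyFilterPath(sample, 'metadata.indices.*.state,-metadata.indices.logstash-*')
));
```

Because exclusion runs first, a field removed by an exclusive filter cannot be brought back by an inclusive one.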

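Note that Elasticsearch sometimes returns the raw value of a field directly, as it does for the _source field. To filter _source fields, consider combining the existing _source parameter (see the Get API for more details) with the filter_path parameter, like this: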

$params = [
    'index' => 'library',
    'refresh' => true,
    'body' => [
        'title' => 'Book #1',
        'rating' => 200.1,
    ],
];
$response = $client->index($params);
$params = [
    'index' => 'library',
    'refresh' => true,
    'body' => [
        'title' => 'Book #2',
        'rating' => 1.7,
    ],
];
$response = $client->index($params);
$params = [
    'index' => 'library',
    'refresh' => true,
    'body' => [
        'title' => 'Book #3',
        'rating' => 0.1,
    ],
];
$response = $client->index($params);
$response = $client->search([
    'filter_path' => 'hits.hits._source',
    '_source' => 'title',
    'sort' => 'rating:desc',
]);

response = client.index(
  index: 'library',
  refresh: true,
  body: {
    title: 'Book #1',
    rating: 200.1
  }
)
puts response

response = client.index(
  index: 'library',
  refresh: true,
  body: {
    title: 'Book #2',
    rating: 1.7
  }
)
puts response

response = client.index(
  index: 'library',
  refresh: true,
  body: {
    title: 'Book #3',
    rating: 0.1
  }
)
puts response

response = client.search(
  filter_path: 'hits.hits._source',
  _source: 'title',
  sort: 'rating:desc'
)
puts response

{
	res, err := es.Index(
		"library",
		strings.NewReader(`{
	  "title": "Book #1",
	  "rating": 200.1
	}`),
		es.Index.WithRefresh("true"),
		es.Index.WithPretty(),
	)
	fmt.Println(res, err)
}

{
	res, err := es.Index(
		"library",
		strings.NewReader(`{
	  "title": "Book #2",
	  "rating": 1.7
	}`),
		es.Index.WithRefresh("true"),
		es.Index.WithPretty(),
	)
	fmt.Println(res, err)
}

{
	res, err := es.Index(
		"library",
		strings.NewReader(`{
	  "title": "Book #3",
	  "rating": 0.1
	}`),
		es.Index.WithRefresh("true"),
		es.Index.WithPretty(),
	)
	fmt.Println(res, err)
}

{
	res, err := es.Search(
		es.Search.WithSource("title"),
		es.Search.WithFilterPath("hits.hits._source"),
		es.Search.WithSort("rating:desc"),
		es.Search.WithPretty(),
	)
	fmt.Println(res, err)
}

const response0 = await client.index({
  index: 'library',
  refresh: true,
  body: {
    title: 'Book #1',
    rating: 200.1
  }
})
console.log(response0)

const response1 = await client.index({
  index: 'library',
  refresh: true,
  body: {
    title: 'Book #2',
    rating: 1.7
  }
})
console.log(response1)

const response2 = await client.index({
  index: 'library',
  refresh: true,
  body: {
    title: 'Book #3',
    rating: 0.1
  }
})
console.log(response2)

const response3 = await client.search({
  filter_path: 'hits.hits._source',
  _source: 'title',
  sort: 'rating:desc'
})
console.log(response3)

POST /library/_doc?refresh
{"title": "Book #1", "rating": 200.1}

POST /library/_doc?refresh
{"title": "Book #2", "rating": 1.7}

POST /library/_doc?refresh
{"title": "Book #3", "rating": 0.1}

GET /_search?filter_path=hits.hits._source&_source=title&sort=rating:desc

Response

{
  "hits" : {
    "hits" : [ {
      "_source":{"title":"Book #1"}
    }, {
      "_source":{"title":"Book #2"}
    }, {
      "_source":{"title":"Book #3"}
    } ]
  }
}

Flat settings

The flat_settings flag affects how lists of settings are rendered. When the flat_settings flag is true, settings are returned in a flat format:

response = client.indices.get_settings(
  index: 'my-index-000001',
  flat_settings: true
)
puts response

GET my-index-000001/_settings?flat_settings=true

Returns

{
  "my-index-000001" : {
    "settings": {
      "index.number_of_replicas": "1",
      "index.number_of_shards": "1",
      "index.creation_date": "1474389951325",
      "index.uuid": "n6gzFZTgS664GUfx0Xrpjw",
      "index.version.created": ...,
      "index.routing.allocation.include._tier_preference" : "data_content",
      "index.provided_name" : "my-index-000001"
    }
  }
}

When the flat_settings flag is false, settings are returned in a more human readable structured format:

response = client.indices.get_settings(
  index: 'my-index-000001',
  flat_settings: false
)
puts response

GET my-index-000001/_settings?flat_settings=false

Returns

{
  "my-index-000001" : {
    "settings" : {
      "index" : {
        "number_of_replicas": "1",
        "number_of_shards": "1",
        "creation_date": "1474389951325",
        "uuid": "n6gzFZTgS664GUfx0Xrpjw",
        "version": {
          "created": ...
        },
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "provided_name" : "my-index-000001"
      }
    }
  }
}

By default, flat_settings is set to false.
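The two layouts carry the same information, so one can be derived from the other. The sketch below converts between them on the client side; the function names `flattenSettings` and `nestSettings` are hypothetical, and the conversion assumes that no setting name is a prefix of another setting name.

```javascript
// Structured layout -> flat layout with dotted keys.
function flattenSettings(settings, prefix = '', flat = {}) {
  for (const [key, value] of Object.entries(settings)) {
    const name = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      flattenSettings(value, name, flat);
    } else {
      flat[name] = value;
    }
  }
  return flat;
}

// Flat layout with dotted keys -> structured layout.
function nestSettings(flat) {
  const nested = {};
  for (const [name, value] of Object.entries(flat)) {
    const keys = name.split('.');
    let node = nested;
    for (const key of keys.slice(0, -1)) {
      if (!(key in node)) node[key] = {};
      node = node[key];
    }
    node[keys[keys.length - 1]] = value;
  }
  return nested;
}

const structured = {
  index: {
    number_of_replicas: '1',
    routing: { allocation: { include: { _tier_preference: 'data_content' } } },
  },
};
console.log(flattenSettings(structured));
```

The flat form is convenient for diffing or grepping settings, while the structured form is easier to read when many settings share a prefix.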

Fuzziness

Some queries and APIs support inexact, fuzzy matching through the fuzziness parameter.

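When querying text or keyword fields, fuzziness is interpreted as a Levenshtein edit distance (the number of one-character changes needed to turn one string into another).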

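The fuzziness parameter can be specified as:

0, 1, 2

The maximum allowed Levenshtein edit distance (or number of edits).

AUTO

Generates an edit distance based on the length of the term. Low and high distance arguments can optionally be provided as AUTO:[low],[high]. If not specified, the default values are 3 and 6, equivalent to AUTO:3,6, which gives the following behavior by term length:

0..2
Must match exactly
3..5
One edit allowed
>5
Two edits allowed

AUTO should generally be the preferred value for fuzziness.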
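The length thresholds and the edit distance can be sketched in a few lines. This is a minimal illustration of the rules described above, not the matching code Elasticsearch runs: the function names are hypothetical, term length is counted in UTF-16 code units, and plain Levenshtein distance is used.

```javascript
// Classic dynamic-programming Levenshtein distance, keeping one row at a time.
function levenshtein(a, b) {
  // previous[j] is the distance between the processed prefix of a and the first j chars of b
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      current[j] = Math.min(
        previous[j] + 1,        // deletion
        current[j - 1] + 1,     // insertion
        previous[j - 1] + cost  // substitution
      );
    }
    previous = current;
  }
  return previous[b.length];
}

// Maximum edits allowed for a term under AUTO:[low],[high].
function autoFuzziness(term, low = 3, high = 6) {
  if (term.length < low) return 0;
  if (term.length < high) return 1;
  return 2;
}

// True when candidate is within the allowed edit distance of term.
function fuzzyMatches(term, candidate, fuzziness = 'AUTO') {
  const maxEdits = fuzziness === 'AUTO' ? autoFuzziness(term) : Number(fuzziness);
  return levenshtein(term, candidate) <= maxEdits;
}

console.log(autoFuzziness('kimchy'));          // 2
console.log(fuzzyMatches('kimchy', 'kimchi')); // true
console.log(fuzzyMatches('ab', 'ac'));         // false
```

With the defaults, very short terms must match exactly, which avoids the flood of false positives that a single edit on a two-letter term would otherwise produce.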

Enabling stack traces

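By default, when a request returns an error, Elasticsearch does not include the stack trace of the error. You can enable it by setting the error_trace URL parameter to true. For example, by default, when you send an invalid size parameter to the _search API: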

POST /my-index-000001/_search?size=surprise_me

The response looks like:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Failed to parse int parameter [size] with value [surprise_me]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Failed to parse int parameter [size] with value [surprise_me]",
    "caused_by" : {
      "type" : "number_format_exception",
      "reason" : "For input string: \"surprise_me\""
    }
  },
  "status" : 400
}

But if you set error_trace=true:

POST /my-index-000001/_search?size=surprise_me&error_trace=true

The response looks like:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Failed to parse int parameter [size] with value [surprise_me]",
        "stack_trace": "Failed to parse int parameter [size] with value [surprise_me]]; nested: IllegalArgumentException..."
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Failed to parse int parameter [size] with value [surprise_me]",
    "stack_trace": "java.lang.IllegalArgumentException: Failed to parse int parameter [size] with value [surprise_me]\n    at org.elasticsearch.rest.RestRequest.paramAsInt(RestRequest.java:175)...",
    "caused_by": {
      "type": "number_format_exception",
      "reason": "For input string: \"surprise_me\"",
      "stack_trace": "java.lang.NumberFormatException: For input string: \"surprise_me\"\n    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)..."
    }
  },
  "status": 400
}
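When handling such responses in client code, it can help to walk the caused_by chain and note which levels carry a stack trace. The sketch below does that for a parsed error body; the function name `describeErrorChain` is hypothetical, and the sample object is an abbreviated copy of the response shown above.

```javascript
// Walk error -> caused_by -> caused_by ... and summarize each level.
function describeErrorChain(responseBody) {
  const chain = [];
  for (let cause = responseBody.error; cause; cause = cause.caused_by) {
    chain.push({
      type: cause.type,
      reason: cause.reason,
      hasStackTrace: typeof cause.stack_trace === 'string',
    });
  }
  return chain;
}

// Abbreviated copy of the error_trace=true response above.
const errorBody = {
  error: {
    type: 'illegal_argument_exception',
    reason: 'Failed to parse int parameter [size] with value [surprise_me]',
    stack_trace: 'java.lang.IllegalArgumentException: ...',
    caused_by: {
      type: 'number_format_exception',
      reason: 'For input string: "surprise_me"',
      stack_trace: 'java.lang.NumberFormatException: ...',
    },
  },
  status: 400,
};

for (const level of describeErrorChain(errorBody)) {
  console.log(`${level.type}: ${level.reason} (stack trace: ${level.hasStackTrace})`);
}
```

Without error_trace=true the same function still works; `hasStackTrace` is simply false at every level.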