Flattened field type
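By default, each subfield in an object is mapped and indexed separately. If the names or types of the subfields are not known in advance, they are mapped dynamically.
The flattened type provides an alternative approach, where the entire object is mapped as a single field. Given an object, the flattened mapping parses out its leaf values and indexes them into one field as keywords. The object's contents can then be searched through simple queries and aggregations.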
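This data type is useful for indexing objects with a large or unknown number of unique keys. Only one field mapping is created for the whole JSON object, which helps prevent a mapping explosion caused by too many distinct field mappings.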
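On the other hand, flattened object fields come with a trade-off in search functionality. Only basic queries are allowed, with no support for numeric range queries or highlighting. For more information on the limitations, see the Supported operations section.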
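The flattened mapping type should not be used for indexing all document content, because it treats all values as keywords and does not provide full search functionality. The default approach, where each subfield has its own entry in the mappings, works well in the majority of cases.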
A flattened object field can be created as follows:
resp = client.indices.create(
    index="bug_reports",
    mappings={
        "properties": {
            "title": {"type": "text"},
            "labels": {"type": "flattened"},
        }
    },
)
print(resp)

resp1 = client.index(
    index="bug_reports",
    id="1",
    document={
        "title": "Results are not sorted correctly.",
        "labels": {
            "priority": "urgent",
            "release": ["v1.2.5", "v1.3.0"],
            "timestamp": {"created": 1541458026, "closed": 1541457010},
        },
    },
)
print(resp1)

response = client.indices.create(
  index: 'bug_reports',
  body: {
    mappings: {
      properties: {
        title: { type: 'text' },
        labels: { type: 'flattened' }
      }
    }
  }
)
puts response

response = client.index(
  index: 'bug_reports',
  id: 1,
  body: {
    title: 'Results are not sorted correctly.',
    labels: {
      priority: 'urgent',
      release: ['v1.2.5', 'v1.3.0'],
      timestamp: { created: 1_541_458_026, closed: 1_541_457_010 }
    }
  }
)
puts response

const response = await client.indices.create({
  index: "bug_reports",
  mappings: {
    properties: {
      title: { type: "text" },
      labels: { type: "flattened" },
    },
  },
});
console.log(response);

const response1 = await client.index({
  index: "bug_reports",
  id: 1,
  document: {
    title: "Results are not sorted correctly.",
    labels: {
      priority: "urgent",
      release: ["v1.2.5", "v1.3.0"],
      timestamp: { created: 1541458026, closed: 1541457010 },
    },
  },
});
console.log(response1);

PUT bug_reports
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "labels": { "type": "flattened" }
    }
  }
}

POST bug_reports/_doc/1
{
  "title": "Results are not sorted correctly.",
  "labels": {
    "priority": "urgent",
    "release": ["v1.2.5", "v1.3.0"],
    "timestamp": { "created": 1541458026, "closed": 1541457010 }
  }
}
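During indexing, tokens are created for each leaf value in the JSON object. The values are indexed as string keywords, without analysis or special handling for numbers or dates.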
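To make the indexing behaviour more concrete, here is a conceptual sketch in Python. It is not how Elasticsearch implements the mapping internally; the helper name leaf_values is hypothetical. It walks a nested object and yields every leaf as a pair of dotted key path and string value, so numbers end up as plain strings.

```python
from typing import Any, Iterator, Tuple


def leaf_values(obj: Any, path: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (dotted key path, value-as-string) for every leaf in obj."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            child = f"{path}.{key}" if path else key
            yield from leaf_values(value, child)
    elif isinstance(obj, list):
        # Array elements share the key path of the array itself.
        for item in obj:
            yield from leaf_values(item, path)
    else:
        # No numeric or date handling: every leaf becomes a string.
        yield path, str(obj)


labels = {
    "priority": "urgent",
    "release": ["v1.2.5", "v1.3.0"],
    "timestamp": {"created": 1541458026, "closed": 1541457010},
}
pairs = list(leaf_values(labels))
for key, value in pairs:
    print(key, "->", value)
```

In this sketch the timestamps come out as the strings "1541458026" and "1541457010", which is why numeric comparisons are not available on them later.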
Querying the top-level flattened field searches all leaf values in the object:
resp = client.search(
    index="bug_reports",
    query={"term": {"labels": "urgent"}},
)
print(resp)

response = client.search(
  index: 'bug_reports',
  body: {
    query: {
      term: { labels: 'urgent' }
    }
  }
)
puts response

const response = await client.search({
  index: "bug_reports",
  query: {
    term: { labels: "urgent" },
  },
});
console.log(response);

POST bug_reports/_search
{
  "query": {
    "term": {"labels": "urgent"}
  }
}
To query on a specific key in the flattened object, use object dot notation:
resp = client.search(
    index="bug_reports",
    query={"term": {"labels.release": "v1.3.0"}},
)
print(resp)

response = client.search(
  index: 'bug_reports',
  body: {
    query: {
      term: { 'labels.release' => 'v1.3.0' }
    }
  }
)
puts response

const response = await client.search({
  index: "bug_reports",
  query: {
    term: { "labels.release": "v1.3.0" },
  },
});
console.log(response);

POST bug_reports/_search
{
  "query": {
    "term": {"labels.release": "v1.3.0"}
  }
}
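Supported operations
Because of the similarities in the way values are indexed, flattened fields share much of the same mapping and search functionality as keyword fields.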
Currently, flattened object fields can be used with the following query types:
- term, terms, and terms_set
- prefix
- range
- match and multi_match
- query_string and simple_query_string
- exists
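When querying, it is not possible to refer to field keys using wildcards, as in { "term": {"labels.time*": 1541457010} }. Note that all queries, including range, treat the values as string keywords. Highlighting is not supported on flattened fields.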
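As a small client-side illustration of the wildcard restriction, the sketch below builds a term query body for a flattened field and refuses key paths that contain a wildcard. The helper flattened_term_query is hypothetical and not part of any Elasticsearch client; it only assembles a request body as a plain dict.

```python
def flattened_term_query(field: str, value, key: str = "") -> dict:
    """Build a term query body for a flattened field or one of its keys."""
    if "*" in field or "*" in key:
        # Keys must be concrete paths; wildcards cannot be used to refer to them.
        raise ValueError("wildcards are not supported in flattened field keys")
    target = f"{field}.{key}" if key else field
    return {"query": {"term": {target: value}}}


print(flattened_term_query("labels", "urgent"))
print(flattened_term_query("labels", "v1.3.0", key="release"))
```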
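It is possible to sort on a flattened object field, as well as perform simple keyword-style aggregations such as terms. As with queries, there is no special support for numerics: all values in the JSON object are treated as keywords. When sorting, this means that values are compared lexicographically.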
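The effect of lexicographic comparison is easy to reproduce with plain Python strings. The sample values below are made up for illustration; the snippet only demonstrates string ordering in general and does not call Elasticsearch.

```python
values = ["9", "10", "200", "1541457010"]

# String ordering, which is how keyword values are compared.
as_strings = sorted(values)
# Numeric ordering, shown only for contrast.
as_numbers = sorted(values, key=int)
print(as_strings)
print(as_numbers)

# A string range from "10" to "200" keeps "1541457010" but drops "9".
in_range = [v for v in values if "10" <= v <= "200"]
print(in_range)
```

If numeric ordering or numeric ranges matter for a particular key, mapping that key as its own numeric field outside the flattened object is the usual way around this.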
Flattened object fields currently cannot be stored. It is not possible to specify the store parameter in the mapping.
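Retrieving flattened fields
Field values and concrete subfields can be retrieved using the fields parameter. Because the flattened field maps an entire object, with potentially many subfields, as a single field, the response contains the unaltered structure from _source.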
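Single subfields, however, can be fetched by specifying them explicitly in the request. This only works for concrete paths, not for wildcards: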
resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "properties": {
            "flattened_field": {"type": "flattened"}
        }
    },
)
print(resp)

resp1 = client.index(
    index="my-index-000001",
    id="1",
    refresh=True,
    document={"flattened_field": {"subfield": "value"}},
)
print(resp1)

resp2 = client.search(
    index="my-index-000001",
    fields=["flattened_field.subfield"],
    source=False,
)
print(resp2)

response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      properties: {
        flattened_field: { type: 'flattened' }
      }
    }
  }
)
puts response

response = client.index(
  index: 'my-index-000001',
  id: 1,
  refresh: true,
  body: {
    flattened_field: { subfield: 'value' }
  }
)
puts response

response = client.search(
  index: 'my-index-000001',
  body: {
    fields: ['flattened_field.subfield'],
    _source: false
  }
)
puts response

const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    properties: {
      flattened_field: { type: "flattened" },
    },
  },
});
console.log(response);

const response1 = await client.index({
  index: "my-index-000001",
  id: 1,
  refresh: "true",
  document: {
    flattened_field: { subfield: "value" },
  },
});
console.log(response1);

const response2 = await client.search({
  index: "my-index-000001",
  fields: ["flattened_field.subfield"],
  _source: false,
});
console.log(response2);

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "flattened_field": { "type": "flattened" }
    }
  }
}

PUT my-index-000001/_doc/1?refresh=true
{
  "flattened_field" : { "subfield" : "value" }
}

POST my-index-000001/_search
{
  "fields": ["flattened_field.subfield"],
  "_source": false
}
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [{
      "_index": "my-index-000001",
      "_id": "1",
      "_score": 1.0,
      "fields": {
        "flattened_field.subfield" : [ "value" ]
      }
    }]
  }
}
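Reading the subfield back out of such a response takes one lookup per hit. The sketch below uses a hard-coded dict that copies the relevant part of the response above, so it runs without a cluster; the variable names are illustrative only.

```python
response = {
    "hits": {
        "hits": [
            {
                "_index": "my-index-000001",
                "_id": "1",
                "_score": 1.0,
                "fields": {"flattened_field.subfield": ["value"]},
            }
        ]
    }
}

# Values returned through `fields` are arrays, keyed by the full dotted path.
subfield_values = [
    hit["fields"].get("flattened_field.subfield", [])
    for hit in response["hits"]["hits"]
]
print(subfield_values)
```

Using .get with an empty list as the default keeps the loop safe for hits where the subfield is absent.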
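You can also use a Painless script to retrieve values from subfields of flattened fields. Instead of including doc['<field_name>'].value in your Painless script, use doc['<field_name>.<sub-field_name>'].value. For example, if you have a flattened field called labels with a release subfield, your Painless script would be doc['labels.release'].value.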
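One place such a script can be used is a runtime field defined in a search request. The sketch below only assembles the request body as a Python dict, assuming an index whose labels field is mapped as flattened. The runtime field name release_check is hypothetical, and the choice of a keyword runtime field is an assumption made for this illustration.

```python
import json

search_body = {
    "runtime_mappings": {
        # "release_check" is a made-up name for this example.
        "release_check": {
            "type": "keyword",
            "script": {
                # The subfield is addressed as <field_name>.<sub-field_name>.
                "source": (
                    "if (doc['labels.release'].value.equals('v1.3.0')) "
                    "{ emit(doc['labels.release'].value) } "
                    "else { emit('Version mismatch') }"
                )
            },
        }
    },
    "fields": ["release_check"],
    "_source": False,
}
print(json.dumps(search_body, indent=2))
```

The printed JSON could be sent as the body of a _search request; no output is shown here because it depends on the documents in the index.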
For example, suppose your mapping contains two fields, one of which is of the flattened type:
resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "properties": {
            "title": {"type": "text"},
            "labels": {"type": "flattened"},
        }
    },
)
print(resp)

response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      properties: {
        title: { type: 'text' },
        labels: { type: 'flattened' }
      }
    }
  }
)
puts response

const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    properties: {
      title: { type: "text" },
      labels: { type: "flattened" },
    },
  },
});
console.log(response);

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "labels": { "type": "flattened" }
    }
  }
}
Index a few documents containing your mapped fields. The labels field has three subfields:
resp = client.bulk(
    index="my-index-000001",
    refresh=True,
    operations=[
        {"index": {}},
        {
            "title": "Something really urgent",
            "labels": {
                "priority": "urgent",
                "release": ["v1.2.5", "v1.3.0"],
                "timestamp": {"created": 1541458026, "closed": 1541457010},
            },
        },
        {"index": {}},
        {
            "title": "Somewhat less urgent",
            "labels": {
                "priority": "high",
                "release": ["v1.3.0"],
                "timestamp": {"created": 1541458026, "closed": 1541457010},
            },
        },
        {"index": {}},
        {
            "title": "Not urgent",
            "labels": {
                "priority": "low",
                "release": ["v1.2.0"],
                "timestamp": {"created": 1541458026, "closed": 1541457010},
            },
        },
    ],
)
print(resp)

response = client.bulk(
  index: 'my-index-000001',
  refresh: true,
  body: [
    { index: {} },
    {
      title: 'Something really urgent',
      labels: {
        priority: 'urgent',
        release: ['v1.2.5', 'v1.3.0'],
        timestamp: { created: 1_541_458_026, closed: 1_541_457_010 }
      }
    },
    { index: {} },
    {
      title: 'Somewhat less urgent',
      labels: {
        priority: 'high',
        release: ['v1.3.0'],
        timestamp: { created: 1_541_458_026, closed: 1_541_457_010 }
      }
    },
    { index: {} },
    {
      title: 'Not urgent',
      labels: {
        priority: 'low',
        release: ['v1.2.0'],
        timestamp: { created: 1_541_458_026, closed: 1_541_457_010 }
      }
    }
  ]
)
puts response

const response = await client.bulk({
  index: "my-index-000001",
  refresh: "true",
  operations: [
    { index: {} },
    {
      title: "Something really urgent",
      labels: {
        priority: "urgent",
        release: ["v1.2.5", "v1.3.0"],
        timestamp: { created: 1541458026, closed: 1541457010 },
      },
    },
    { index: {} },
    {
      title: "Somewhat less urgent",
      labels: {
        priority: "high",
        release: ["v1.3.0"],
        timestamp: { created: 1541458026, closed: 1541457010 },
      },
    },
    { index: {} },
    {
      title: "Not urgent",
      labels: {
        priority: "low",
        release: ["v1.2.0"],
        timestamp: { created: 1541458026, closed: 1541457010 },
      },
    },
  ],
});
console.log(response);

POST /my-index-000001/_bulk?refresh
{"index":{}}
{"title":"Something really urgent","labels":{"priority":"urgent","release":["v1.2.5","v1.3.0"],"timestamp":{"created":1541458026,"closed":1541457010}}}
{"index":{}}
{"title":"Somewhat less urgent","labels":{"priority":"high","release":["v1.3.0"],"timestamp":{"created":1541458026,"closed":1541457010}}}
{"index":{}}
{"title":"Not urgent","labels":{"priority":"low","release":["v1.2.0"],"timestamp":{"created":1541458026,"closed":1541457010}}}
Because labels is a flattened field type, the entire object is mapped as a single field. To retrieve values from this subfield in a Painless script, use the doc['<field_name>.<sub-field_name>'].value format.
"script": {
  "source": """
    if (doc['labels.release'].value.equals('v1.3.0'))
    {emit(doc['labels.release'].value)}
    else{emit('Version mismatch')}
  """
}
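Parameters for flattened object fields
The following mapping parameters are accepted:
| Parameter | Description |
| --- | --- |
| depth_limit | The maximum allowed depth of the flattened object field, in terms of nested inner objects. If a flattened object field exceeds this limit, an error is thrown. Defaults to 20. |
| doc_values | Should the field be stored on disk in a column-stride fashion, so that it can later be used for sorting, aggregations, or scripting? Accepts true (default) or false. |
| eager_global_ordinals | Should global ordinals be loaded eagerly on refresh? Accepts true or false (default). |
| ignore_above | Leaf values longer than this limit will not be indexed. By default, there is no limit and all values will be indexed. Note that this limit applies to the leaf values within the flattened object field, and not to the length of the entire field. |
| index | Determines if the field should be searchable. Accepts true (default) or false. |
| index_options | What information should be stored in the index for scoring purposes. Defaults to docs. |
| null_value | A string value which is substituted for any explicit null values within the flattened object field. |
| similarity | Which scoring algorithm or similarity should be used. Defaults to BM25. |
| split_queries_on_whitespace | Whether full text queries should split the input on whitespace when building a query for this field. Accepts true or false (default). |
| time_series_dimensions | (Optional, array of strings) A list of fields inside the flattened object, where each field is a dimension of the time series. Each field is specified using the relative path from the root field and does not include the root field name. |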
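As an illustration of how a few of these parameters might be combined, the sketch below assembles a mapping body as a Python dict. depth_limit, ignore_above, and null_value are the Elasticsearch parameter names for the depth limit, the leaf-value length limit, and the null substitute described above. The values 5, 256, and "NULL" are arbitrary examples chosen for this sketch, not recommendations.

```python
import json

mapping = {
    "properties": {
        "labels": {
            "type": "flattened",
            "depth_limit": 5,      # example value: reject objects nested deeper than 5
            "ignore_above": 256,   # example value: skip leaf values longer than 256
            "null_value": "NULL",  # example value: stand-in for explicit nulls
        }
    }
}
print(json.dumps({"mappings": mapping}, indent=2))
```

The printed JSON has the same shape as the body of the index creation requests shown earlier.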
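Synthetic _source
Synthetic _source is generally available only for TSDB indices (indices that have index.mode set to time_series). For other indices, synthetic _source is in technical preview. Features in technical preview may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.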
Flattened fields support synthetic `_source` in their default configuration.
Synthetic source may sort flattened field values and remove duplicates. For example:
resp = client.indices.create(
    index="idx",
    settings={"index": {"mapping": {"source": {"mode": "synthetic"}}}},
    mappings={"properties": {"flattened": {"type": "flattened"}}},
)
print(resp)

resp1 = client.index(
    index="idx",
    id="1",
    document={
        "flattened": {
            "field": [
                "apple", "apple", "banana", "avocado", "10", "200",
                "AVOCADO", "Banana", "Tangerine",
            ]
        }
    },
)
print(resp1)

const response = await client.indices.create({
  index: "idx",
  settings: {
    index: { mapping: { source: { mode: "synthetic" } } },
  },
  mappings: {
    properties: { flattened: { type: "flattened" } },
  },
});
console.log(response);

const response1 = await client.index({
  index: "idx",
  id: 1,
  document: {
    flattened: {
      field: [
        "apple", "apple", "banana", "avocado", "10", "200",
        "AVOCADO", "Banana", "Tangerine",
      ],
    },
  },
});
console.log(response1);

PUT idx
{
  "settings": {
    "index": { "mapping": { "source": { "mode": "synthetic" } } }
  },
  "mappings": {
    "properties": { "flattened": { "type": "flattened" } }
  }
}

PUT idx/_doc/1
{
  "flattened": {
    "field": [ "apple", "apple", "banana", "avocado", "10", "200", "AVOCADO", "Banana", "Tangerine" ]
  }
}
Will become:
{
  "flattened": {
    "field": [ "10", "200", "AVOCADO", "Banana", "Tangerine", "apple", "avocado", "banana" ]
  }
}
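The resulting order can be reproduced with ordinary string sorting. The sketch below is a plain Python analogy using the values from the example above, not the code Elasticsearch runs.

```python
values = ["apple", "apple", "banana", "avocado", "10", "200",
          "AVOCADO", "Banana", "Tangerine"]

# set() drops the duplicate "apple"; sorted() orders strings by code point,
# so digits come before uppercase letters, which come before lowercase ones.
synthetic = sorted(set(values))
print(synthetic)
```

Anything that depends on the original array order or on repeated values therefore should not rely on _source in this mode.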
Synthetic source always uses nested objects instead of arrays of objects. For example:
resp = client.indices.create(
    index="idx",
    settings={"index": {"mapping": {"source": {"mode": "synthetic"}}}},
    mappings={"properties": {"flattened": {"type": "flattened"}}},
)
print(resp)

resp1 = client.index(
    index="idx",
    id="1",
    document={
        "flattened": {
            "field": [
                {"id": 1, "name": "foo"},
                {"id": 2, "name": "bar"},
                {"id": 3, "name": "baz"},
            ]
        }
    },
)
print(resp1)

const response = await client.indices.create({
  index: "idx",
  settings: {
    index: { mapping: { source: { mode: "synthetic" } } },
  },
  mappings: {
    properties: { flattened: { type: "flattened" } },
  },
});
console.log(response);

const response1 = await client.index({
  index: "idx",
  id: 1,
  document: {
    flattened: {
      field: [
        { id: 1, name: "foo" },
        { id: 2, name: "bar" },
        { id: 3, name: "baz" },
      ],
    },
  },
});
console.log(response1);

PUT idx
{
  "settings": {
    "index": { "mapping": { "source": { "mode": "synthetic" } } }
  },
  "mappings": {
    "properties": { "flattened": { "type": "flattened" } }
  }
}

PUT idx/_doc/1
{
  "flattened": {
    "field": [
      { "id": 1, "name": "foo" },
      { "id": 2, "name": "bar" },
      { "id": 3, "name": "baz" }
    ]
  }
}
Will become (note the nested objects instead of the "flattened" array):
{
  "flattened": {
    "field": {
      "id": [ "1", "2", "3" ],
      "name": [ "bar", "baz", "foo" ]
    }
  }
}
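The same reshaping can be mimicked in a few lines of Python. This is an analogy built from the example above, not Elasticsearch's implementation: values are grouped per key, converted to strings, deduplicated, and sorted.

```python
from collections import defaultdict

items = [
    {"id": 1, "name": "foo"},
    {"id": 2, "name": "bar"},
    {"id": 3, "name": "baz"},
]

merged = defaultdict(set)
for item in items:
    for key, value in item.items():
        merged[key].add(str(value))  # leaf values are kept as strings

synthetic = {key: sorted(vals) for key, vals in merged.items()}
print(synthetic)
```

Note that the pairing between each id and its name is lost in this shape: "1" was stored with "foo", but the sorted name list starts with "bar".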
Synthetic source always uses single-valued fields for one-element arrays. For example:
resp = client.indices.create(
    index="idx",
    settings={"index": {"mapping": {"source": {"mode": "synthetic"}}}},
    mappings={"properties": {"flattened": {"type": "flattened"}}},
)
print(resp)

resp1 = client.index(
    index="idx",
    id="1",
    document={"flattened": {"field": ["foo"]}},
)
print(resp1)

const response = await client.indices.create({
  index: "idx",
  settings: {
    index: { mapping: { source: { mode: "synthetic" } } },
  },
  mappings: {
    properties: { flattened: { type: "flattened" } },
  },
});
console.log(response);

const response1 = await client.index({
  index: "idx",
  id: 1,
  document: {
    flattened: { field: ["foo"] },
  },
});
console.log(response1);

PUT idx
{
  "settings": {
    "index": { "mapping": { "source": { "mode": "synthetic" } } }
  },
  "mappings": {
    "properties": { "flattened": { "type": "flattened" } }
  }
}

PUT idx/_doc/1
{
  "flattened": {
    "field": [ "foo" ]
  }
}
Will become (note the single value instead of the one-element array):
{
  "flattened": {
    "field": "foo"
  }
}