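Dynamic templates

Dynamic templates give you greater control over how Elasticsearch maps your data beyond the default dynamic field mapping rules. You enable dynamic mapping by setting the dynamic parameter to true or runtime. You can then use dynamic templates to define custom mappings that are applied to dynamically added fields based on the matching condition.

Use the {name} and {dynamic_type} template variables in the mapping specification as placeholders.

Dynamic field mappings are only added when a field contains a concrete value. Elasticsearch doesn't add a dynamic field mapping when the field contains null or an empty array. If the null_value option is used in a dynamic_template, it is only applied after the first document with a concrete value for the field has been indexed.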
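
As a rough illustration of the concrete-value rule, the sketch below models which field values would trigger a dynamic mapping. The has_concrete_value helper is hypothetical and is not part of Elasticsearch or its clients; treating an array that holds only null values like an empty array is an assumption of this sketch.

```python
def has_concrete_value(value):
    """Return True if the value would trigger a dynamic mapping in this simplified model."""
    if value is None:
        return False
    if isinstance(value, list):
        # An array counts only if at least one element is itself concrete.
        return any(has_concrete_value(item) for item in value)
    return True


doc = {"a": None, "b": [], "c": [None, 3], "d": "text"}
mapped = [name for name, value in doc.items() if has_concrete_value(value)]
print(mapped)
```

In this model only the fields c and d would receive a dynamic mapping.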

Dynamic templates are specified as an array of named objects:

  "dynamic_templates": [
    {
      "my_template_name": { 
        ... match conditions ... 
        "mapping": { ... } 
      }
    },
    ...
  ]

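The template name (my_template_name above) can be any string value.

The match conditions can include any of: match_mapping_type, match, match_pattern, unmatch, path_match, path_unmatch.

The mapping entry defines the mapping that the matched field should use.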
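
The structure can also be assembled programmatically before it is sent to Elasticsearch. The following is a minimal sketch that builds the array with plain Python dictionaries; make_template is a hypothetical helper introduced only for this illustration, and the template names are examples.

```python
import json


def make_template(name, mapping, **conditions):
    """Build one entry of the dynamic_templates array: an object with a single named key."""
    body = dict(conditions)    # match conditions such as match_mapping_type or match
    body["mapping"] = mapping  # the mapping that matching fields should use
    return {name: body}


dynamic_templates = [
    make_template("strings_as_keywords", {"type": "keyword"}, match_mapping_type="string"),
    make_template("ip_fields", {"type": "ip"}, match=["ip_*", "*_ip"]),
]

print(json.dumps({"dynamic_templates": dynamic_templates}, indent=2))
```

The printed JSON has the same shape as the skeleton above and could be placed under mappings in a create index request.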

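Validating dynamic templates

If a provided mapping contains an invalid mapping snippet, a validation error is returned. Validation occurs when applying the dynamic template at index time, and, in most cases, when the dynamic template is updated. Providing an invalid mapping snippet may cause the update or validation of a dynamic template to fail under certain conditions:

  • If no match_mapping_type has been specified but the template is valid for at least one predefined mapping type, the mapping snippet is considered valid. However, a validation error is returned at index time if a field matching the template is indexed as a different type. For example, a dynamic template configured without match_mapping_type is considered valid as a string type, but if a field matching the dynamic template is indexed as a long, a validation error is returned at index time. It is recommended to configure match_mapping_type to the expected JSON type or to configure the desired type in the mapping snippet.
  • If the {name} placeholder is used in the mapping snippet, validation is skipped when updating the dynamic template. This is because the field name is unknown at that time. Instead, validation occurs when the template is applied at index time.

Templates are processed in order: the first matching template wins. When putting new dynamic templates through the update mapping API, all existing templates are overwritten. This allows dynamic templates to be reordered or deleted after they were initially added.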
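
The ordering rule can be pictured with a small model. This is a simplified sketch, not Elasticsearch's implementation: first_matching_template is a hypothetical helper, and it only looks at match_mapping_type, ignoring every other match condition.

```python
def first_matching_template(templates, detected_type):
    """Return the name of the first template whose match_mapping_type accepts the type."""
    for entry in templates:
        name, body = next(iter(entry.items()))
        wanted = body.get("match_mapping_type")
        if wanted in ("*", detected_type):
            return name
    return None


templates = [
    {"integers": {"match_mapping_type": "long", "mapping": {"type": "integer"}}},
    {"catch_all": {"match_mapping_type": "*", "mapping": {"type": "keyword"}}},
]

print(first_matching_template(templates, "long"))
print(first_matching_template(templates, "string"))
# Reversing the order lets the catch-all template shadow the more specific one.
print(first_matching_template(list(reversed(templates)), "long"))
```

Because the first match wins, more specific templates generally belong before broad ones such as a wildcard template.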

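Mapping runtime fields in a dynamic template

If you want Elasticsearch to dynamically map new fields of a certain type as runtime fields, set "dynamic":"runtime" in the index mappings. These fields are not indexed, and are loaded from _source at query time.

Alternatively, you can use the default dynamic mapping rules and then create dynamic templates to map specific fields as runtime fields. You set "dynamic":"true" in your index mapping, and then create a dynamic template to map new fields of a certain type as runtime fields.

Let's say you have data where each of the fields starts with ip_. Based on the dynamic mapping rules, Elasticsearch maps any string that passes numeric detection as a float or long. However, you can create a dynamic template that maps new strings as runtime fields of type ip.

The following request defines a dynamic template named strings_as_ip. When Elasticsearch detects new string fields matching the ip* pattern, it maps those fields as runtime fields of type ip. Because ip fields aren't mapped dynamically, you can use this template with either "dynamic":"true" or "dynamic":"runtime".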

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "strings_as_ip": {
                    "match_mapping_type": "string",
                    "match": "ip*",
                    "runtime": {
                        "type": "ip"
                    }
                }
            }
        ]
    },
)
print(resp)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          strings_as_ip: {
            match_mapping_type: 'string',
            match: 'ip*',
            runtime: {
              type: 'ip'
            }
          }
        }
      ]
    }
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        strings_as_ip: {
          match_mapping_type: "string",
          match: "ip*",
          runtime: {
            type: "ip",
          },
        },
      },
    ],
  },
});
console.log(response);
PUT my-index-000001/
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_ip": {
          "match_mapping_type": "string",
          "match": "ip*",
          "runtime": {
            "type": "ip"
          }
        }
      }
    ]
  }
}

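See the example later in this document that uses dynamic templates to map string fields as either indexed fields or runtime fields.

match_mapping_type and unmatch_mapping_type

The match_mapping_type parameter matches fields by the data type detected by the JSON parser, while unmatch_mapping_type excludes fields based on the data type.

Because JSON doesn't distinguish a long from an integer or a double from a float, any parsed floating point number is considered a double JSON data type, while any parsed integer number is considered a long.

With dynamic mappings, Elasticsearch will always choose the wider data type. The one exception is float, which requires less storage space than double and is precise enough for most applications. Runtime fields do not support float, which is why "dynamic":"runtime" uses double.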

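Elasticsearch automatically detects the following data types:

| JSON data type | Elasticsearch data type with "dynamic":"true" | Elasticsearch data type with "dynamic":"runtime" |
|---|---|---|
| null | No field added | No field added |
| true or false | boolean | boolean |
| double | float | double |
| long | long | long |
| object | object | No field added |
| array | Depends on the first non-null value in the array | Depends on the first non-null value in the array |
| string that passes date detection | date | date |
| string that passes numeric detection | float or long | double or long |
| string that doesn't pass date detection or numeric detection | text with a .keyword sub-field | keyword |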
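
To make the JSON-level detection more concrete, here is a simplified sketch that classifies parsed JSON values into the data types used by match_mapping_type. The detected_type function is hypothetical, and date detection and numeric detection for strings are deliberately not modeled, so every string is reported as string.

```python
import json


def detected_type(value):
    """Classify a parsed JSON value; returns None when no field would be added."""
    if value is None:
        return None
    if isinstance(value, bool):  # check bool first: bool is a subclass of int in Python
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # An array takes the type of its first non-null element.
        for item in value:
            item_type = detected_type(item)
            if item_type is not None:
                return item_type
        return None
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported JSON value: {value!r}")


doc = json.loads('{"n": 5, "ratio": 0.5, "flag": true, "obj": {"k": 1}, "tags": [null, "a"], "gone": null}')
types = {name: detected_type(value) for name, value in doc.items()}
print(types)
```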

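You can specify either a single data type or a list of data types for either the match_mapping_type or unmatch_mapping_type parameters. You can also use a wildcard (*) for the match_mapping_type parameter to match all data types.

For example, if we wanted to map all integer fields as integer instead of long, and all string fields as both text and keyword, we could use the following template: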

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "numeric_counts": {
                    "match_mapping_type": [
                        "long",
                        "double"
                    ],
                    "match": "count",
                    "mapping": {
                        "type": "{dynamic_type}",
                        "index": False
                    }
                }
            },
            {
                "integers": {
                    "match_mapping_type": "long",
                    "mapping": {
                        "type": "integer"
                    }
                }
            },
            {
                "strings": {
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "text",
                        "fields": {
                            "raw": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    }
                }
            },
            {
                "non_objects_keyword": {
                    "match_mapping_type": "*",
                    "unmatch_mapping_type": "object",
                    "mapping": {
                        "type": "keyword"
                    }
                }
            }
        ]
    },
)
print(resp)

resp1 = client.index(
    index="my-index-000001",
    id="1",
    document={
        "my_integer": 5,
        "my_string": "Some string",
        "my_boolean": False,
        "field": {
            "count": 4
        }
    },
)
print(resp1)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          numeric_counts: {
            match_mapping_type: [
              'long',
              'double'
            ],
            match: 'count',
            mapping: {
              type: '{dynamic_type}',
              index: false
            }
          }
        },
        {
          integers: {
            match_mapping_type: 'long',
            mapping: {
              type: 'integer'
            }
          }
        },
        {
          strings: {
            match_mapping_type: 'string',
            mapping: {
              type: 'text',
              fields: {
                raw: {
                  type: 'keyword',
                  ignore_above: 256
                }
              }
            }
          }
        },
        {
          non_objects_keyword: {
            match_mapping_type: '*',
            unmatch_mapping_type: 'object',
            mapping: {
              type: 'keyword'
            }
          }
        }
      ]
    }
  }
)
puts response

response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    my_integer: 5,
    my_string: 'Some string',
    my_boolean: false,
    field: {
      count: 4
    }
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        numeric_counts: {
          match_mapping_type: ["long", "double"],
          match: "count",
          mapping: {
            type: "{dynamic_type}",
            index: false,
          },
        },
      },
      {
        integers: {
          match_mapping_type: "long",
          mapping: {
            type: "integer",
          },
        },
      },
      {
        strings: {
          match_mapping_type: "string",
          mapping: {
            type: "text",
            fields: {
              raw: {
                type: "keyword",
                ignore_above: 256,
              },
            },
          },
        },
      },
      {
        non_objects_keyword: {
          match_mapping_type: "*",
          unmatch_mapping_type: "object",
          mapping: {
            type: "keyword",
          },
        },
      },
    ],
  },
});
console.log(response);

const response1 = await client.index({
  index: "my-index-000001",
  id: 1,
  document: {
    my_integer: 5,
    my_string: "Some string",
    my_boolean: false,
    field: {
      count: 4,
    },
  },
});
console.log(response1);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "numeric_counts": {
          "match_mapping_type": ["long", "double"],
          "match": "count",
          "mapping": {
            "type": "{dynamic_type}",
            "index": false
          }
        }
      },
      {
        "integers": {
          "match_mapping_type": "long",
          "mapping": {
            "type": "integer"
          }
        }
      },
      {
        "strings": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "fields": {
              "raw": {
                "type":  "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      },
      {
        "non_objects_keyword": {
          "match_mapping_type": "*",
          "unmatch_mapping_type": "object",
          "mapping": {
            "type": "keyword"
          }
        }
      }
    ]
  }
}

PUT my-index-000001/_doc/1
{
  "my_integer": 5,
  "my_string": "Some string",
  "my_boolean": false,
  "field": {"count": 4} 
}

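The my_integer field is mapped as an integer.

The my_string field is mapped as a text, with a keyword multi-field.

The my_boolean field is mapped as a keyword.

The field.count field is mapped as a long.

match and unmatch

The match parameter uses one or more patterns to match on the field name, while unmatch uses one or more patterns to exclude fields matched by match.

The match_pattern parameter adjusts the behavior of the match parameter to support full Java regular expressions matching on the field name instead of simple wildcards. For example: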

  "match_pattern": "regex",
  "match": "^profit_\d+$"

The following example matches all string fields whose name starts with long_ (except for those which end with _text) and maps them as long fields:

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "longs_as_strings": {
                    "match_mapping_type": "string",
                    "match": "long_*",
                    "unmatch": "*_text",
                    "mapping": {
                        "type": "long"
                    }
                }
            }
        ]
    },
)
print(resp)

resp1 = client.index(
    index="my-index-000001",
    id="1",
    document={
        "long_num": "5",
        "long_text": "foo"
    },
)
print(resp1)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          longs_as_strings: {
            match_mapping_type: 'string',
            match: 'long_*',
            unmatch: '*_text',
            mapping: {
              type: 'long'
            }
          }
        }
      ]
    }
  }
)
puts response

response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    long_num: '5',
    long_text: 'foo'
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        longs_as_strings: {
          match_mapping_type: "string",
          match: "long_*",
          unmatch: "*_text",
          mapping: {
            type: "long",
          },
        },
      },
    ],
  },
});
console.log(response);

const response1 = await client.index({
  index: "my-index-000001",
  id: 1,
  document: {
    long_num: "5",
    long_text: "foo",
  },
});
console.log(response1);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "longs_as_strings": {
          "match_mapping_type": "string",
          "match":   "long_*",
          "unmatch": "*_text",
          "mapping": {
            "type": "long"
          }
        }
      }
    ]
  }
}

PUT my-index-000001/_doc/1
{
  "long_num": "5", 
  "long_text": "foo" 
}

The long_num field is mapped as a long.

The long_text field uses the default string mappings.
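
The difference between simple wildcards and match_pattern set to regex can be sketched in Python. This is only an approximation: Elasticsearch uses Java regular expressions, whereas the sketch uses Python's re module, and name_matches is a hypothetical helper. Matching the whole field name in regex mode is an assumption of this sketch; with the anchored pattern shown earlier the distinction does not matter.

```python
import re


def name_matches(pattern, field_name, match_pattern="simple"):
    """Check one pattern against a field name in simple or regex mode."""
    if match_pattern == "regex":
        return re.fullmatch(pattern, field_name) is not None
    # Simple mode: only * is special and stands for any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, field_name, flags=re.DOTALL) is not None


print(name_matches("long_*", "long_num"))
print(name_matches(r"^profit_\d+$", "profit_2023", match_pattern="regex"))
print(name_matches(r"^profit_\d+$", "profit_q1", match_pattern="regex"))
# In simple mode the regex syntax is taken literally, so this does not match.
print(name_matches(r"^profit_\d+$", "profit_2023"))
```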

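You can specify a list of patterns using a JSON array for either the match or unmatch fields.

The next example matches all fields whose name starts with ip_ or ends with _ip, except for fields which start with one or end with two, and maps them as ip fields: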

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "ip_fields": {
                    "match": [
                        "ip_*",
                        "*_ip"
                    ],
                    "unmatch": [
                        "one*",
                        "*two"
                    ],
                    "mapping": {
                        "type": "ip"
                    }
                }
            }
        ]
    },
)
print(resp)

resp1 = client.index(
    index="my-index-000001",
    id="1",
    document={
        "one_ip": "will not match",
        "ip_two": "will not match",
        "three_ip": "12.12.12.12",
        "ip_four": "13.13.13.13"
    },
)
print(resp1)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          ip_fields: {
            match: [
              'ip_*',
              '*_ip'
            ],
            unmatch: [
              'one*',
              '*two'
            ],
            mapping: {
              type: 'ip'
            }
          }
        }
      ]
    }
  }
)
puts response

response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    one_ip: 'will not match',
    ip_two: 'will not match',
    three_ip: '12.12.12.12',
    ip_four: '13.13.13.13'
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        ip_fields: {
          match: ["ip_*", "*_ip"],
          unmatch: ["one*", "*two"],
          mapping: {
            type: "ip",
          },
        },
      },
    ],
  },
});
console.log(response);

const response1 = await client.index({
  index: "my-index-000001",
  id: 1,
  document: {
    one_ip: "will not match",
    ip_two: "will not match",
    three_ip: "12.12.12.12",
    ip_four: "13.13.13.13",
  },
});
console.log(response1);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "ip_fields": {
          "match":   ["ip_*", "*_ip"],
          "unmatch": ["one*", "*two"],
          "mapping": {
            "type": "ip"
          }
        }
      }
    ]
  }
}

PUT my-index-000001/_doc/1
{
  "one_ip":   "will not match", 
  "ip_two":   "will not match", 
  "three_ip": "12.12.12.12", 
  "ip_four":  "13.13.13.13" 
}

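The one_ip field is unmatched, so it uses the default mapping of text.

The ip_two field is unmatched, so it uses the default mapping of text.

The three_ip field is mapped as type ip.

The ip_four field is mapped as type ip.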
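
The interplay of match and unmatch lists can be checked offline with a small model. The sketch below is a simplified approximation of the behavior described here, not Elasticsearch's code; simple_match, as_list and field_matches are hypothetical helpers.

```python
import re


def simple_match(pattern, name):
    """Match a pattern in which * stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name, flags=re.DOTALL) is not None


def as_list(patterns):
    """Accept a single pattern, a list of patterns, or None."""
    if patterns is None:
        return []
    return [patterns] if isinstance(patterns, str) else list(patterns)


def field_matches(name, match=None, unmatch=None):
    """A field qualifies if any match pattern hits and no unmatch pattern hits."""
    if match is not None and not any(simple_match(p, name) for p in as_list(match)):
        return False
    return not any(simple_match(p, name) for p in as_list(unmatch))


fields = ["one_ip", "ip_two", "three_ip", "ip_four"]
result = {f: field_matches(f, match=["ip_*", "*_ip"], unmatch=["one*", "*two"]) for f in fields}
print(result)
```

The result agrees with the annotations: only three_ip and ip_four satisfy the template's conditions.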

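path_match and path_unmatch

The path_match and path_unmatch parameters work in the same way as match and unmatch, but operate on the full dotted path to the field, not just the final name, e.g. some_object.*.some_field.

This example copies the values of any fields in the name object to the top-level full_name field, except for the middle field: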

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "full_name": {
                    "path_match": "name.*",
                    "path_unmatch": "*.middle",
                    "mapping": {
                        "type": "text",
                        "copy_to": "full_name"
                    }
                }
            }
        ]
    },
)
print(resp)

resp1 = client.index(
    index="my-index-000001",
    id="1",
    document={
        "name": {
            "first": "John",
            "middle": "Winston",
            "last": "Lennon"
        }
    },
)
print(resp1)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          full_name: {
            path_match: 'name.*',
            path_unmatch: '*.middle',
            mapping: {
              type: 'text',
              copy_to: 'full_name'
            }
          }
        }
      ]
    }
  }
)
puts response

response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    name: {
      first: 'John',
      middle: 'Winston',
      last: 'Lennon'
    }
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        full_name: {
          path_match: "name.*",
          path_unmatch: "*.middle",
          mapping: {
            type: "text",
            copy_to: "full_name",
          },
        },
      },
    ],
  },
});
console.log(response);

const response1 = await client.index({
  index: "my-index-000001",
  id: 1,
  document: {
    name: {
      first: "John",
      middle: "Winston",
      last: "Lennon",
    },
  },
});
console.log(response1);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "full_name": {
          "path_match":   "name.*",
          "path_unmatch": "*.middle",
          "mapping": {
            "type":       "text",
            "copy_to":    "full_name"
          }
        }
      }
    ]
  }
}

PUT my-index-000001/_doc/1
{
  "name": {
    "first":  "John",
    "middle": "Winston",
    "last":   "Lennon"
  }
}

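The following example uses an array of patterns for both path_match and path_unmatch.

The values of any fields in the name object or the user.name object are copied to the top-level full_name field, except for the middle and midinitial fields: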

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "full_name": {
                    "path_match": [
                        "name.*",
                        "user.name.*"
                    ],
                    "path_unmatch": [
                        "*.middle",
                        "*.midinitial"
                    ],
                    "mapping": {
                        "type": "text",
                        "copy_to": "full_name"
                    }
                }
            }
        ]
    },
)
print(resp)

resp1 = client.index(
    index="my-index-000001",
    id="1",
    document={
        "name": {
            "first": "John",
            "middle": "Winston",
            "last": "Lennon"
        }
    },
)
print(resp1)

resp2 = client.index(
    index="my-index-000001",
    id="2",
    document={
        "user": {
            "name": {
                "first": "Jane",
                "midinitial": "M",
                "last": "Salazar"
            }
        }
    },
)
print(resp2)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          full_name: {
            path_match: [
              'name.*',
              'user.name.*'
            ],
            path_unmatch: [
              '*.middle',
              '*.midinitial'
            ],
            mapping: {
              type: 'text',
              copy_to: 'full_name'
            }
          }
        }
      ]
    }
  }
)
puts response

response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    name: {
      first: 'John',
      middle: 'Winston',
      last: 'Lennon'
    }
  }
)
puts response

response = client.index(
  index: 'my-index-000001',
  id: 2,
  body: {
    user: {
      name: {
        first: 'Jane',
        midinitial: 'M',
        last: 'Salazar'
      }
    }
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        full_name: {
          path_match: ["name.*", "user.name.*"],
          path_unmatch: ["*.middle", "*.midinitial"],
          mapping: {
            type: "text",
            copy_to: "full_name",
          },
        },
      },
    ],
  },
});
console.log(response);

const response1 = await client.index({
  index: "my-index-000001",
  id: 1,
  document: {
    name: {
      first: "John",
      middle: "Winston",
      last: "Lennon",
    },
  },
});
console.log(response1);

const response2 = await client.index({
  index: "my-index-000001",
  id: 2,
  document: {
    user: {
      name: {
        first: "Jane",
        midinitial: "M",
        last: "Salazar",
      },
    },
  },
});
console.log(response2);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "full_name": {
          "path_match":   ["name.*", "user.name.*"],
          "path_unmatch": ["*.middle", "*.midinitial"],
          "mapping": {
            "type":       "text",
            "copy_to":    "full_name"
          }
        }
      }
    ]
  }
}

PUT my-index-000001/_doc/1
{
  "name": {
    "first":  "John",
    "middle": "Winston",
    "last":   "Lennon"
  }
}

PUT my-index-000001/_doc/2
{
  "user": {
    "name": {
      "first":      "Jane",
      "midinitial": "M",
      "last":       "Salazar"
    }
  }
}

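Note that the path_match and path_unmatch parameters match on object paths in addition to leaf fields. As an example, indexing the following document will result in an error because the path_match setting also matches the object field name.title, which can't be mapped as text: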

resp = client.index(
    index="my-index-000001",
    id="2",
    document={
        "name": {
            "first": "Paul",
            "last": "McCartney",
            "title": {
                "value": "Sir",
                "category": "order of chivalry"
            }
        }
    },
)
print(resp)
response = client.index(
  index: 'my-index-000001',
  id: 2,
  body: {
    name: {
      first: 'Paul',
      last: 'McCartney',
      title: {
        value: 'Sir',
        category: 'order of chivalry'
      }
    }
  }
)
puts response
const response = await client.index({
  index: "my-index-000001",
  id: 2,
  document: {
    name: {
      first: "Paul",
      last: "McCartney",
      title: {
        value: "Sir",
        category: "order of chivalry",
      },
    },
  },
});
console.log(response);
PUT my-index-000001/_doc/2
{
  "name": {
    "first":  "Paul",
    "last":   "McCartney",
    "title": {
      "value": "Sir",
      "category": "order of chivalry"
    }
  }
}
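
To see why name.title is caught by the template, the following sketch lists every dotted path in that document, object paths included, and tests each against the path_match and path_unmatch patterns. It is a simplified model rather than Elasticsearch's implementation; simple_match and all_paths are hypothetical helpers, and the sketch lets * span dots, in line with the user.name examples above.

```python
import re


def simple_match(pattern, path):
    """Match a pattern in which * stands for any run of characters, dots included."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, path, flags=re.DOTALL) is not None


def all_paths(obj, prefix=""):
    """Yield (dotted_path, value) for every object and leaf field under obj."""
    for key, value in obj.items():
        path = prefix + key
        yield path, value
        if isinstance(value, dict):
            yield from all_paths(value, path + ".")


doc = {
    "name": {
        "first": "Paul",
        "last": "McCartney",
        "title": {"value": "Sir", "category": "order of chivalry"},
    }
}

hits = []
for path, value in all_paths(doc):
    if simple_match("name.*", path) and not simple_match("*.middle", path):
        hits.append((path, "object" if isinstance(value, dict) else "leaf"))
print(hits)
```

The object path name.title appears among the hits, which is the case that cannot be mapped as text. One way to avoid the error is to add match_mapping_type to the template so that it only applies to string values.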

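Template variables

The {name} and {dynamic_type} placeholders are replaced in the mapping with the field name and detected dynamic type. The following example sets all string fields to use an analyzer with the same name as the field, and disables doc_values for all non-string fields: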

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "named_analyzers": {
                    "match_mapping_type": "string",
                    "match": "*",
                    "mapping": {
                        "type": "text",
                        "analyzer": "{name}"
                    }
                }
            },
            {
                "no_doc_values": {
                    "match_mapping_type": "*",
                    "mapping": {
                        "type": "{dynamic_type}",
                        "doc_values": False
                    }
                }
            }
        ]
    },
)
print(resp)

resp1 = client.index(
    index="my-index-000001",
    id="1",
    document={
        "english": "Some English text",
        "count": 5
    },
)
print(resp1)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          named_analyzers: {
            match_mapping_type: 'string',
            match: '*',
            mapping: {
              type: 'text',
              analyzer: '{name}'
            }
          }
        },
        {
          no_doc_values: {
            match_mapping_type: '*',
            mapping: {
              type: '{dynamic_type}',
              doc_values: false
            }
          }
        }
      ]
    }
  }
)
puts response

response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    english: 'Some English text',
    count: 5
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        named_analyzers: {
          match_mapping_type: "string",
          match: "*",
          mapping: {
            type: "text",
            analyzer: "{name}",
          },
        },
      },
      {
        no_doc_values: {
          match_mapping_type: "*",
          mapping: {
            type: "{dynamic_type}",
            doc_values: false,
          },
        },
      },
    ],
  },
});
console.log(response);

const response1 = await client.index({
  index: "my-index-000001",
  id: 1,
  document: {
    english: "Some English text",
    count: 5,
  },
});
console.log(response1);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "named_analyzers": {
          "match_mapping_type": "string",
          "match": "*",
          "mapping": {
            "type": "text",
            "analyzer": "{name}"
          }
        }
      },
      {
        "no_doc_values": {
          "match_mapping_type":"*",
          "mapping": {
            "type": "{dynamic_type}",
            "doc_values": false
          }
        }
      }
    ]
  }
}

PUT my-index-000001/_doc/1
{
  "english": "Some English text", 
  "count":   5 
}

The english field is mapped as a text field with the english analyzer.

The count field is mapped as a long field with doc_values disabled.
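
The placeholder substitution can be pictured as a plain string replacement over the mapping snippet. The sketch below is an illustrative model only; render_mapping is a hypothetical helper and not an Elasticsearch API.

```python
def render_mapping(mapping, name, dynamic_type):
    """Recursively replace {name} and {dynamic_type} in all string values of a mapping."""
    if isinstance(mapping, dict):
        return {key: render_mapping(value, name, dynamic_type) for key, value in mapping.items()}
    if isinstance(mapping, list):
        return [render_mapping(item, name, dynamic_type) for item in mapping]
    if isinstance(mapping, str):
        return mapping.replace("{name}", name).replace("{dynamic_type}", dynamic_type)
    return mapping  # booleans and numbers are left untouched


english = render_mapping({"type": "text", "analyzer": "{name}"}, "english", "string")
count = render_mapping({"type": "{dynamic_type}", "doc_values": False}, "count", "long")
print(english)
print(count)
```

The rendered snippets correspond to the mappings described for the english and count fields.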

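Dynamic template examples

Here are some examples of potentially useful dynamic templates:

Structured search

When you set "dynamic":"true", Elasticsearch maps string fields as a text field with a keyword subfield. If you are only indexing structured content and are not interested in full text search, you can make Elasticsearch map your fields only as keyword fields. However, you must then search on the exact same value that was indexed to find matches in these fields.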

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "strings_as_keywords": {
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "keyword"
                    }
                }
            }
        ]
    },
)
print(resp)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          strings_as_keywords: {
            match_mapping_type: 'string',
            mapping: {
              type: 'keyword'
            }
          }
        }
      ]
    }
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        strings_as_keywords: {
          match_mapping_type: "string",
          mapping: {
            type: "keyword",
          },
        },
      },
    ],
  },
});
console.log(response);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_keywords": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "keyword"
          }
        }
      }
    ]
  }
}
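
The exact-value requirement for keyword fields can be illustrated with a toy comparison. This is a deliberately crude sketch: keyword_matches and text_matches are hypothetical helpers, and lowercasing plus whitespace splitting only stands in for real text analysis, which is considerably richer.

```python
def keyword_matches(indexed_value, query):
    """A keyword field keeps the whole value as one term, so only an identical string matches."""
    return indexed_value == query


def text_matches(indexed_value, query):
    """Crude stand-in for full-text search: lowercase both sides and compare word by word."""
    terms = set(indexed_value.lower().split())
    return all(word in terms for word in query.lower().split())


value = "Some English text"
print(keyword_matches(value, "Some English text"))
print(keyword_matches(value, "english"))
print(text_matches(value, "english"))
```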

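text-only mappings for strings

Contrary to the previous example, if you only care about full-text search on string fields and don't plan on running aggregations, sorting, or exact searches, you could tell Elasticsearch to map strings as text: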

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "strings_as_text": {
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "text"
                    }
                }
            }
        ]
    },
)
print(resp)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          strings_as_text: {
            match_mapping_type: 'string',
            mapping: {
              type: 'text'
            }
          }
        }
      ]
    }
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        strings_as_text: {
          match_mapping_type: "string",
          mapping: {
            type: "text",
          },
        },
      },
    ],
  },
});
console.log(response);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_text": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text"
          }
        }
      }
    ]
  }
}

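Alternatively, you can create a dynamic template to map your string fields as keyword fields in the runtime section of the mapping. When Elasticsearch detects new fields of type string, those fields are created as runtime fields of type keyword.

Although your string fields won't be indexed, their values are stored in _source and can be used in search requests, aggregations, filtering, and sorting.

For example, the following request creates a dynamic template to map string fields as runtime fields of type keyword. Although the runtime definition is blank, new string fields are mapped as keyword runtime fields based on the dynamic mapping rules that Elasticsearch uses for adding field types to the mapping. Any string that doesn't pass date detection or numeric detection is automatically mapped as a keyword: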

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "strings_as_keywords": {
                    "match_mapping_type": "string",
                    "runtime": {}
                }
            }
        ]
    },
)
print(resp)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          strings_as_keywords: {
            match_mapping_type: 'string',
            runtime: {}
          }
        }
      ]
    }
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        strings_as_keywords: {
          match_mapping_type: "string",
          runtime: {},
        },
      },
    ],
  },
});
console.log(response);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_keywords": {
          "match_mapping_type": "string",
          "runtime": {}
        }
      }
    ]
  }
}

You index a simple document:

resp = client.index(
    index="my-index-000001",
    id="1",
    document={
        "english": "Some English text",
        "count": 5
    },
)
print(resp)
response = client.index(
  index: 'my-index-000001',
  id: 1,
  body: {
    english: 'Some English text',
    count: 5
  }
)
puts response
const response = await client.index({
  index: "my-index-000001",
  id: 1,
  document: {
    english: "Some English text",
    count: 5,
  },
});
console.log(response);
PUT my-index-000001/_doc/1
{
  "english": "Some English text",
  "count":   5
}

When you view the mapping, you'll see that the english field is a runtime field of type keyword:

resp = client.indices.get_mapping(
    index="my-index-000001",
)
print(resp)
response = client.indices.get_mapping(
  index: 'my-index-000001'
)
puts response
const response = await client.indices.getMapping({
  index: "my-index-000001",
});
console.log(response);
GET my-index-000001/_mapping
{
  "my-index-000001" : {
    "mappings" : {
      "dynamic_templates" : [
        {
          "strings_as_keywords" : {
            "match_mapping_type" : "string",
            "runtime" : { }
          }
        }
      ],
      "runtime" : {
        "english" : {
          "type" : "keyword"
        }
      },
      "properties" : {
        "count" : {
          "type" : "long"
        }
      }
    }
  }
}

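Disabled norms

Norms are index-time scoring factors. If you do not care about scoring, which would be the case for instance if you never sort documents by score, you could disable the storage of these scoring factors in the index and save some space.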

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "strings_as_keywords": {
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "text",
                        "norms": False,
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    }
                }
            }
        ]
    },
)
print(resp)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          strings_as_keywords: {
            match_mapping_type: 'string',
            mapping: {
              type: 'text',
              norms: false,
              fields: {
                keyword: {
                  type: 'keyword',
                  ignore_above: 256
                }
              }
            }
          }
        }
      ]
    }
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        strings_as_keywords: {
          match_mapping_type: "string",
          mapping: {
            type: "text",
            norms: false,
            fields: {
              keyword: {
                type: "keyword",
                ignore_above: 256,
              },
            },
          },
        },
      },
    ],
  },
});
console.log(response);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_keywords": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "norms": false,
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    ]
  }
}

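The keyword multi-field appears in this template to be consistent with the default rules of dynamic mappings. Of course, if you do not need it because you don't need to perform exact search or aggregate on this field, you could remove it as described in the previous section.

Time series

When doing time series analysis with Elasticsearch, it is common to have many numeric fields that you will often aggregate on but never filter on. In such a case, you could disable indexing on those fields to save disk space and also maybe gain some indexing speed: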

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "dynamic_templates": [
            {
                "unindexed_longs": {
                    "match_mapping_type": "long",
                    "mapping": {
                        "type": "long",
                        "index": False
                    }
                }
            },
            {
                "unindexed_doubles": {
                    "match_mapping_type": "double",
                    "mapping": {
                        "type": "float",
                        "index": False
                    }
                }
            }
        ]
    },
)
print(resp)
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      dynamic_templates: [
        {
          unindexed_longs: {
            match_mapping_type: 'long',
            mapping: {
              type: 'long',
              index: false
            }
          }
        },
        {
          unindexed_doubles: {
            match_mapping_type: 'double',
            mapping: {
              type: 'float',
              index: false
            }
          }
        }
      ]
    }
  }
)
puts response
const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    dynamic_templates: [
      {
        unindexed_longs: {
          match_mapping_type: "long",
          mapping: {
            type: "long",
            index: false,
          },
        },
      },
      {
        unindexed_doubles: {
          match_mapping_type: "double",
          mapping: {
            type: "float",
            index: false,
          },
        },
      },
    ],
  },
});
console.log(response);
PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "unindexed_longs": {
          "match_mapping_type": "long",
          "mapping": {
            "type": "long",
            "index": false
          }
        }
      },
      {
        "unindexed_doubles": {
          "match_mapping_type": "double",
          "mapping": {
            "type": "float", 
            "index": false
          }
        }
      }
    ]
  }
}

Like the default dynamic mapping rules, doubles are mapped as floats, which are usually accurate enough, yet require half the disk space.
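
The size and precision trade-off between the two types can be checked with Python's struct module. This sketch compares raw IEEE 754 encodings only; the actual on-disk footprint in an index also depends on how values are encoded and compressed, which is not modeled here.

```python
import struct

float_bytes = struct.calcsize("<f")   # single precision
double_bytes = struct.calcsize("<d")  # double precision
print(float_bytes, double_bytes)

value = 0.1
# Round-trip the value through each encoding to see what precision survives.
as_float = struct.unpack("<f", struct.pack("<f", value))[0]
as_double = struct.unpack("<d", struct.pack("<d", value))[0]
print(as_float, as_double)
```

The single-precision round trip changes the value slightly, while the double-precision one returns it unchanged.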