Numeric field types

The following numeric types are supported:

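`long`	A signed 64-bit integer with a minimum value of `-2⁶³` and a maximum value of `2⁶³-1`.
`integer`	A signed 32-bit integer with a minimum value of `-2³¹` and a maximum value of `2³¹-1`.
`short`	A signed 16-bit integer with a minimum value of `-32,768` and a maximum value of `32,767`.
`byte`	A signed 8-bit integer with a minimum value of `-128` and a maximum value of `127`.
`double`	A double-precision 64-bit IEEE 754 floating point number, restricted to finite values.
`float`	A single-precision 32-bit IEEE 754 floating point number, restricted to finite values.
`half_float`	A half-precision 16-bit IEEE 754 floating point number, restricted to finite values.
`scaled_float`	A floating point number that is backed by a `long`, scaled by a fixed `double` scaling factor.
`unsigned_long`	An unsigned 64-bit integer with a minimum value of 0 and a maximum value of `2⁶⁴-1`.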

Below is an example of configuring a mapping with numeric fields:

resp = client.indices.create(
    index="my-index-000001",
    mappings={
        "properties": {
            "number_of_bytes": {
                "type": "integer"
            },
            "time_in_seconds": {
                "type": "float"
            },
            "price": {
                "type": "scaled_float",
                "scaling_factor": 100
            }
        }
    },
)
print(resp)

response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      properties: {
        number_of_bytes: {
          type: 'integer'
        },
        time_in_seconds: {
          type: 'float'
        },
        price: {
          type: 'scaled_float',
          scaling_factor: 100
        }
      }
    }
  }
)
puts response

res, err := es.Indices.Create(
	"my-index-000001",
	es.Indices.Create.WithBody(strings.NewReader(`{
	  "mappings": {
	    "properties": {
	      "number_of_bytes": {
	        "type": "integer"
	      },
	      "time_in_seconds": {
	        "type": "float"
	      },
	      "price": {
	        "type": "scaled_float",
	        "scaling_factor": 100
	      }
	    }
	  }
	}`)),
)
fmt.Println(res, err)

const response = await client.indices.create({
  index: "my-index-000001",
  mappings: {
    properties: {
      number_of_bytes: {
        type: "integer",
      },
      time_in_seconds: {
        type: "float",
      },
      price: {
        type: "scaled_float",
        scaling_factor: 100,
      },
    },
  },
});
console.log(response);

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "number_of_bytes": {
        "type": "integer"
      },
      "time_in_seconds": {
        "type": "float"
      },
      "price": {
        "type": "scaled_float",
        "scaling_factor": 100
      }
    }
  }
}

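The double, float and half_float types treat -0.0 and +0.0 as different values. As a consequence, a term query on -0.0 will not match +0.0, and vice versa. The same is true for range queries: if the upper bound is -0.0 then +0.0 will not match, and if the lower bound is +0.0 then -0.0 will not match.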
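
To make the distinction concrete, the following Python sketch prints the IEEE 754 bit patterns of the two zeros. It only illustrates that they are distinct encodings even though ordinary floating-point comparison treats them as equal; it does not reproduce how Elasticsearch encodes values in its index. The helper name `double_bits` is made up for this illustration.

```python
import struct


def double_bits(x: float) -> int:
    """Return the raw 64-bit IEEE 754 pattern of a Python float."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


# Ordinary comparison cannot tell the two zeros apart...
print(-0.0 == 0.0)             # True
# ...but their encodings differ in the sign bit.
print(hex(double_bits(0.0)))   # 0x0
print(hex(double_bits(-0.0)))  # 0x8000000000000000
```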

Which type should I use?

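As far as integer types (byte, short, integer and long) are concerned, you should pick the smallest type that is enough for your use case. This helps make indexing and searching more efficient. Note, however, that storage is optimized based on the actual values that are stored, so picking one type over another has no impact on storage requirements.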
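
As a rough aid for that choice, the sketch below picks the smallest signed integer type whose range (taken from the table of types above) covers an expected minimum and maximum. The function name `smallest_integer_type` and the idea of deciding from a known value range are assumptions for illustration; Elasticsearch itself does not provide such a helper.

```python
# Signed integer ranges, ordered from smallest to largest type.
INTEGER_RANGES = {
    "byte": (-2**7, 2**7 - 1),
    "short": (-2**15, 2**15 - 1),
    "integer": (-2**31, 2**31 - 1),
    "long": (-2**63, 2**63 - 1),
}


def smallest_integer_type(expected_min: int, expected_max: int) -> str:
    """Return the first (smallest) type whose range covers both bounds."""
    for name, (type_min, type_max) in INTEGER_RANGES.items():
        if type_min <= expected_min and expected_max <= type_max:
            return name
    raise ValueError("values do not fit in a signed 64-bit integer")


print(smallest_integer_type(0, 100))       # byte
print(smallest_integer_type(0, 40_000))    # integer
print(smallest_integer_type(-1, 2**40))    # long
```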

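For floating-point types, it is often more efficient to store floating-point data as an integer using a scaling factor, which is what the scaled_float type does under the hood. For instance, a price field could be stored in a scaled_float with a scaling_factor of 100. All APIs would work as if the field was stored as a double, but under the hood Elasticsearch would be working with the number of cents, price*100, which is an integer. This is mostly helpful to save disk space, since integers are much easier to compress than floating-point numbers. scaled_float is also a good choice for trading accuracy for disk space. For instance, imagine that you are tracking CPU utilization as a number between 0 and 1. It usually does not matter much whether CPU utilization is 12.7% or 13%, so you could use a scaled_float with a scaling_factor of 100 to round CPU utilization to the closest percent and save space.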
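
The arithmetic described above can be sketched in a few lines of Python. This is a minimal model of the idea (multiply by the factor, round to an integer, divide again when reading), not Elasticsearch's implementation; the function names `encode_scaled` and `decode_scaled` are hypothetical. Note also that Python's `round` resolves exact ties to the nearest even integer, which may differ from Elasticsearch's rounding at tie values, so the examples avoid ties.

```python
def encode_scaled(value: float, scaling_factor: float) -> int:
    """Model of index-time encoding: scale, then round to an integer."""
    return round(value * scaling_factor)


def decode_scaled(stored: int, scaling_factor: float) -> float:
    """Model of search-time decoding: divide the stored integer back."""
    return stored / scaling_factor


# A price kept as a whole number of cents.
cents = encode_scaled(19.99, 100)
print(cents, decode_scaled(cents, 100))      # 1999 19.99

# CPU utilization rounded to the closest percent.
percent = encode_scaled(0.127, 100)
print(percent, decode_scaled(percent, 100))  # 13 0.13
```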

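If scaled_float is not a good fit, then you should pick the smallest type that is enough for the use case among the floating-point types: double, float and half_float. Here is a table that compares these types to help you decide.

Type	Minimum value	Maximum value	Significant bits / digits	Example precision loss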
`double`	`2^-1074`	`(2-2^-52)·2¹⁰²³`	`53` / `15.95`	`1.2345678912345678`→ `1.234567891234568`
`float`	`2^-149`	`(2-2^-23)·2¹²⁷`	`24` / `7.22`	`1.23456789`→ `1.2345679`
`half_float`	`2^-24`	`65504`	`11` / `3.31`	`1.2345`→ `1.234375`
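
The precision loss in the float and half_float rows can be reproduced with Python's standard `struct` module, which supports IEEE 754 single precision (`f`) and half precision (`e`). This is only an illustration of the number formats; the helper name `round_trip` is made up here.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Pack a Python float into a narrower IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


as_float = round_trip(1.23456789, "<f")  # 32-bit single precision
as_half = round_trip(1.2345, "<e")       # 16-bit half precision

print(as_float)  # close to 1.2345679, no longer 1.23456789
print(as_half)   # 1.234375
```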

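Mapping numeric identifiers

Not all numeric data should be mapped as a numeric field data type. Elasticsearch optimizes numeric fields, such as integer or long, for range queries. However, keyword fields are better for term and other term-level queries.

Identifiers, such as an ISBN or a product ID, are rarely used in range queries. However, they are often retrieved using term-level queries.

Consider mapping a numeric identifier as a keyword if:

You don't plan to search for the identifier data using range queries.
Fast retrieval is important. term query searches on keyword fields are often faster than term searches on numeric fields.

If you're unsure which to use, you can use a multi-field to map the data as both a keyword and a numeric data type.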
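
A multi-field mapping of that kind might look like the sketch below, written as a Python dictionary in the same shape as the `mappings` argument used in the earlier example. The field name `product_id` and the sub-field name `numeric` are hypothetical; the sketch assumes the standard `fields` mapping parameter for multi-fields.

```python
# Hypothetical mapping: the identifier is a keyword for term-level queries,
# with a numeric sub-field for the occasional range query.
mappings = {
    "properties": {
        "product_id": {
            "type": "keyword",
            "fields": {
                "numeric": {"type": "long"},
            },
        }
    }
}

# Term-level queries would target "product_id";
# range queries would target "product_id.numeric".
print(mappings["properties"]["product_id"]["fields"]["numeric"]["type"])  # long
```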

Parameters for numeric fields

The following parameters are accepted by numeric types:

coerce

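Try to convert strings to numbers and truncate fractions for integers. Accepts true (default) and false. Not applicable for unsigned_long. Note that this cannot be set if the script parameter is used.

doc_values

Should the field be stored on disk in a column-stride fashion, so that it can later be used for sorting, aggregations, or scripting? Accepts true (default) or false.

ignore_malformed

If true, malformed numbers are ignored. If false (default), malformed numbers throw an exception and reject the whole document. Note that this cannot be set if the script parameter is used.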

index

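Should the field be quickly searchable? Accepts true (default) and false. Numeric fields that only have doc_values enabled can also be queried, albeit more slowly.

meta

Metadata about the field.

null_value

Accepts a numeric value of the same type as the field, which is substituted for any explicit null values. Defaults to null, which means the field is treated as missing. Note that this cannot be set if the script parameter is used.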

on_script_error

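Defines what to do if the script defined by the script parameter throws an error at indexing time. Accepts fail (default), which causes the entire document to be rejected, and continue, which registers the field in the document's _ignored metadata field and continues indexing. This parameter can only be set if the script field is also set.

script

If this parameter is set, then the field will index values generated by this script, rather than reading the values directly from the source. If a value is set for this field on the input document, then the document will be rejected with an error. Scripts are in the same format as their runtime equivalent. Scripts can only be configured on long and double field types.

store

Whether the field value should be stored and retrievable separately from the _source field. Accepts true or false (default).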

time_series_dimension

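(Optional, Boolean)

Marks the field as a time series dimension. Defaults to false.

The index.mapping.dimension_fields.limit index setting limits the number of dimensions in an index.

Dimension fields have the following constraints:

The doc_values and index mapping parameters must be true.

Of the numeric field types, only byte, short, integer, long, and unsigned_long fields support this parameter.

A numeric field can't be both a time series dimension and a time series metric.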

time_series_metric

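(Optional, string) Marks the field as a time series metric. The value is the metric type. You can't update this parameter for existing fields.

Valid time_series_metric values for numeric fields:

counter: A cumulative metric that only monotonically increases or resets to 0 (zero). For example, a count of errors or completed tasks.
gauge: A metric that represents a single numeric value that can arbitrarily increase or decrease. For example, a temperature or available disk space.
null (default): Not a time series metric.

For a numeric time series metric, the doc_values parameter must be true. A numeric field can't be both a time series dimension and a time series metric.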
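
The constraints listed for time_series_dimension and time_series_metric can be summarized as a small client-side check. The sketch below is a hypothetical helper (the name `check_time_series_field` and its messages are made up here, and Elasticsearch performs its own validation when the mapping is created); it only encodes the rules stated above for numeric fields.

```python
DIMENSION_TYPES = {"byte", "short", "integer", "long", "unsigned_long"}
NUMERIC_METRICS = {"counter", "gauge"}


def check_time_series_field(field: dict) -> list:
    """Return the rule violations found in one numeric field mapping."""
    problems = []
    is_dimension = field.get("time_series_dimension", False)
    metric = field.get("time_series_metric")  # None means "not a metric"
    doc_values = field.get("doc_values", True)
    indexed = field.get("index", True)

    if is_dimension:
        if field.get("type") not in DIMENSION_TYPES:
            problems.append("type does not support time_series_dimension")
        if not (doc_values and indexed):
            problems.append("dimension requires doc_values and index to be true")
    if metric is not None:
        if metric not in NUMERIC_METRICS:
            problems.append("unknown time_series_metric value")
        if not doc_values:
            problems.append("metric requires doc_values to be true")
    if is_dimension and metric is not None:
        problems.append("field cannot be both a dimension and a metric")
    return problems


print(check_time_series_field({"type": "long", "time_series_dimension": True}))
print(check_time_series_field({"type": "float", "time_series_dimension": True}))
```

The first call returns an empty list; the second reports that float does not support time_series_dimension.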

Parameters for `scaled_float`

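scaled_float accepts an additional parameter:

scaling_factor

The scaling factor to use when encoding values. Values are multiplied by this factor at index time and rounded to the closest long value. For instance, a scaled_float with a scaling_factor of 10 would internally store 2.34 as 23, and all search-time operations (queries, aggregations, sorting) behave as if the document had a value of 2.3. Higher values of scaling_factor improve accuracy but also increase space requirements. This parameter is required.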

`scaled_float` saturation

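scaled_float is stored as a single long value, which is the product of multiplying the original value by the scaling factor. If the multiplication results in a value that is outside the range of a long, the value is saturated to the minimum or maximum value of a long. For example, if the scaling factor is 100 and the value is 92233720368547758.08, the expected value is 9223372036854775808. However, the value that is stored is 9223372036854775807, the maximum value for a long.

This can lead to unexpected results with range queries when the scaling factor or the provided float value is exceptionally large.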
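
The saturation behaviour can be modelled in Python, where integers are unbounded and the clamp therefore has to be written out explicitly. This is a sketch of the described arithmetic under the assumption of double-precision multiplication, not Elasticsearch's code; the function name `saturate_scaled` is hypothetical.

```python
LONG_MIN = -2**63
LONG_MAX = 2**63 - 1


def saturate_scaled(value: float, scaling_factor: float) -> int:
    """Scale and round, then clamp the result to the signed 64-bit range."""
    scaled = round(value * scaling_factor)
    return max(LONG_MIN, min(LONG_MAX, scaled))


# The product is 2**63 in double precision, one past the largest long,
# so the stored value saturates at LONG_MAX.
print(saturate_scaled(92233720368547758.08, 100))  # 9223372036854775807
```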

Synthetic `_source`

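Synthetic _source is Generally Available only for TSDB indices (indices that have index.mode set to time_series). For other indices, synthetic _source is in technical preview. Features in technical preview may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

All numeric fields support synthetic _source in their default configuration. Synthetic _source cannot be used together with copy_to, or with doc_values disabled.

Synthetic source may sort numeric field values. For example: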

resp = client.indices.create(
    index="idx",
    settings={
        "index": {
            "mapping": {
                "source": {
                    "mode": "synthetic"
                }
            }
        }
    },
    mappings={
        "properties": {
            "long": {
                "type": "long"
            }
        }
    },
)
print(resp)

resp1 = client.index(
    index="idx",
    id="1",
    document={
        "long": [
            0,
            0,
            -123466,
            87612
        ]
    },
)
print(resp1)

const response = await client.indices.create({
  index: "idx",
  settings: {
    index: {
      mapping: {
        source: {
          mode: "synthetic",
        },
      },
    },
  },
  mappings: {
    properties: {
      long: {
        type: "long",
      },
    },
  },
});
console.log(response);

const response1 = await client.index({
  index: "idx",
  id: 1,
  document: {
    long: [0, 0, -123466, 87612],
  },
});
console.log(response1);

PUT idx
{
  "settings": {
    "index": {
      "mapping": {
        "source": {
          "mode": "synthetic"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "long": { "type": "long" }
    }
  }
}
PUT idx/_doc/1
{
  "long": [0, 0, -123466, 87612]
}

Will become:

{
  "long": [-123466, 0, 0, 87612]
}

Scaled floats will always apply their scaling factor, so:

resp = client.indices.create(
    index="idx",
    settings={
        "index": {
            "mapping": {
                "source": {
                    "mode": "synthetic"
                }
            }
        }
    },
    mappings={
        "properties": {
            "f": {
                "type": "scaled_float",
                "scaling_factor": 0.01
            }
        }
    },
)
print(resp)

resp1 = client.index(
    index="idx",
    id="1",
    document={
        "f": 123
    },
)
print(resp1)

const response = await client.indices.create({
  index: "idx",
  settings: {
    index: {
      mapping: {
        source: {
          mode: "synthetic",
        },
      },
    },
  },
  mappings: {
    properties: {
      f: {
        type: "scaled_float",
        scaling_factor: 0.01,
      },
    },
  },
});
console.log(response);

const response1 = await client.index({
  index: "idx",
  id: 1,
  document: {
    f: 123,
  },
});
console.log(response1);

PUT idx
{
  "settings": {
    "index": {
      "mapping": {
        "source": {
          "mode": "synthetic"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "f": { "type": "scaled_float", "scaling_factor": 0.01 }
    }
  }
}
PUT idx/_doc/1
{
  "f": 123
}

Will become:

{
  "f": 100.0
}
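
Both synthetic _source effects shown in this section can be mimicked in plain Python to see where the returned values come from. The sketch assumes that values are rebuilt by sorting (for long arrays) and by scaling, rounding, and dividing back (for scaled_float); the function names are hypothetical and this is not Elasticsearch's implementation.

```python
def synthetic_long_values(values: list) -> list:
    """Model: numeric arrays come back in sorted order, duplicates kept."""
    return sorted(values)


def synthetic_scaled_float(value: float, scaling_factor: float) -> float:
    """Model: the stored integer is divided by the scaling factor on read."""
    stored = round(value * scaling_factor)
    return stored / scaling_factor


print(synthetic_long_values([0, 0, -123466, 87612]))  # [-123466, 0, 0, 87612]
# 123 * 0.01 is about 1.23, which is stored as 1 and read back as 100.0.
print(synthetic_scaled_float(123, 0.01))              # 100.0
```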
