Distance feature query

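Boosts the relevance score of documents that are closer to a provided origin date or point. For example, you can use this query to give more weight to documents closer to a certain date or location.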

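You can use the distance_feature query to find the nearest neighbors to a location. You can also use the query in a bool search's should filter to add boosted relevance scores to the bool query's scores.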

Example request

Index setup

To use the distance_feature query, your index must include a date, date_nanos, or geo_point field.

To see how you can set up an index for the distance_feature query, try the following example.

  1. Create an items index with the following field mapping:

    • name, a keyword field
    • production_date, a date field
    • location, a geo_point field

    Python:

    resp = client.indices.create(
        index="items",
        mappings={
            "properties": {
                "name": {
                    "type": "keyword"
                },
                "production_date": {
                    "type": "date"
                },
                "location": {
                    "type": "geo_point"
                }
            }
        },
    )
    print(resp)

    Ruby:

    response = client.indices.create(
      index: 'items',
      body: {
        mappings: {
          properties: {
            name: {
              type: 'keyword'
            },
            production_date: {
              type: 'date'
            },
            location: {
              type: 'geo_point'
            }
          }
        }
      }
    )
    puts response

    JavaScript:

    const response = await client.indices.create({
      index: "items",
      mappings: {
        properties: {
          name: {
            type: "keyword",
          },
          production_date: {
            type: "date",
          },
          location: {
            type: "geo_point",
          },
        },
      },
    });
    console.log(response);

    Console:

    PUT /items
    {
      "mappings": {
        "properties": {
          "name": {
            "type": "keyword"
          },
          "production_date": {
            "type": "date"
          },
          "location": {
            "type": "geo_point"
          }
        }
      }
    }
  2. Index several documents to this index.

    Python:

    resp = client.index(
        index="items",
        id="1",
        refresh=True,
        document={
            "name": "chocolate",
            "production_date": "2018-02-01",
            "location": [
                -71.34,
                41.12
            ]
        },
    )
    print(resp)
    
    resp1 = client.index(
        index="items",
        id="2",
        refresh=True,
        document={
            "name": "chocolate",
            "production_date": "2018-01-01",
            "location": [
                -71.3,
                41.15
            ]
        },
    )
    print(resp1)
    
    resp2 = client.index(
        index="items",
        id="3",
        refresh=True,
        document={
            "name": "chocolate",
            "production_date": "2017-12-01",
            "location": [
                -71.3,
                41.12
            ]
        },
    )
    print(resp2)

    Ruby:

    response = client.index(
      index: 'items',
      id: 1,
      refresh: true,
      body: {
        name: 'chocolate',
        production_date: '2018-02-01',
        location: [
          -71.34,
          41.12
        ]
      }
    )
    puts response
    
    response = client.index(
      index: 'items',
      id: 2,
      refresh: true,
      body: {
        name: 'chocolate',
        production_date: '2018-01-01',
        location: [
          -71.3,
          41.15
        ]
      }
    )
    puts response
    
    response = client.index(
      index: 'items',
      id: 3,
      refresh: true,
      body: {
        name: 'chocolate',
        production_date: '2017-12-01',
        location: [
          -71.3,
          41.12
        ]
      }
    )
    puts response

    JavaScript:

    const response = await client.index({
      index: "items",
      id: 1,
      refresh: "true",
      document: {
        name: "chocolate",
        production_date: "2018-02-01",
        location: [-71.34, 41.12],
      },
    });
    console.log(response);
    
    const response1 = await client.index({
      index: "items",
      id: 2,
      refresh: "true",
      document: {
        name: "chocolate",
        production_date: "2018-01-01",
        location: [-71.3, 41.15],
      },
    });
    console.log(response1);
    
    const response2 = await client.index({
      index: "items",
      id: 3,
      refresh: "true",
      document: {
        name: "chocolate",
        production_date: "2017-12-01",
        location: [-71.3, 41.12],
      },
    });
    console.log(response2);

    Console:

    PUT /items/_doc/1?refresh
    {
      "name" : "chocolate",
      "production_date": "2018-02-01",
      "location": [-71.34, 41.12]
    }
    
    PUT /items/_doc/2?refresh
    {
      "name" : "chocolate",
      "production_date": "2018-01-01",
      "location": [-71.3, 41.15]
    }
    
    
    PUT /items/_doc/3?refresh
    {
      "name" : "chocolate",
      "production_date": "2017-12-01",
      "location": [-71.3, 41.12]
    }

Example queries

Boost documents based on date

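The following bool search returns documents with a name value of chocolate. The search also uses the distance_feature query to increase the relevance score of documents with a production_date value closer to now.

Python:

resp = client.search(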
    index="items",
    query={
        "bool": {
            "must": {
                "match": {
                    "name": "chocolate"
                }
            },
            "should": {
                "distance_feature": {
                    "field": "production_date",
                    "pivot": "7d",
                    "origin": "now"
                }
            }
        }
    },
)
print(resp)

Ruby:

response = client.search(
  index: 'items',
  body: {
    query: {
      bool: {
        must: {
          match: {
            name: 'chocolate'
          }
        },
        should: {
          distance_feature: {
            field: 'production_date',
            pivot: '7d',
            origin: 'now'
          }
        }
      }
    }
  }
)
puts response

JavaScript:

const response = await client.search({
  index: "items",
  query: {
    bool: {
      must: {
        match: {
          name: "chocolate",
        },
      },
      should: {
        distance_feature: {
          field: "production_date",
          pivot: "7d",
          origin: "now",
        },
      },
    },
  },
});
console.log(response);

Console:

GET /items/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "chocolate"
        }
      },
      "should": {
        "distance_feature": {
          "field": "production_date",
          "pivot": "7d",
          "origin": "now"
        }
      }
    }
  }
}
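
To make the effect of the 7d pivot concrete, the following minimal sketch reproduces the scoring arithmetic in plain Python for the three sample documents. Because now changes with every request, it assumes a fixed, hypothetical origin of 2018-02-08. The helper name date_feature_score is an illustration only and is not part of any client library. The sketch covers only the distance_feature contribution, using the formula given under "How the distance_feature query calculates relevance scores", and ignores the score added by the match clause.

```python
from datetime import datetime, timedelta


def date_feature_score(origin, value, pivot, boost=1.0):
    """Hypothetical helper: boost * pivot / (pivot + distance) for dates."""
    distance = abs(origin - value)  # absolute time difference as a timedelta
    # Dividing two timedeltas yields a plain float ratio.
    return boost * (pivot / (pivot + distance))


# Assumption: a fixed stand-in for "now" so the numbers are reproducible.
origin = datetime(2018, 2, 8)
pivot = timedelta(days=7)

# production_date values of the three sample documents, keyed by document ID.
production_dates = {
    "1": datetime(2018, 2, 1),
    "2": datetime(2018, 1, 1),
    "3": datetime(2017, 12, 1),
}

date_scores = {
    doc_id: date_feature_score(origin, value, pivot)
    for doc_id, value in production_dates.items()
}

# Print documents from most to least boosted.
for doc_id, score in sorted(date_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(doc_id, round(score, 4))
```

Under this assumed origin, document 1 is exactly one pivot (7 days) away, so it receives half of the default boost, and the older documents receive progressively less.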
Boost documents based on location

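The following bool search returns documents with a name value of chocolate. The search also uses the distance_feature query to increase the relevance score of documents with a location value closer to [-71.3, 41.15].

Python:

resp = client.search(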
    index="items",
    query={
        "bool": {
            "must": {
                "match": {
                    "name": "chocolate"
                }
            },
            "should": {
                "distance_feature": {
                    "field": "location",
                    "pivot": "1000m",
                    "origin": [
                        -71.3,
                        41.15
                    ]
                }
            }
        }
    },
)
print(resp)

Ruby:

response = client.search(
  index: 'items',
  body: {
    query: {
      bool: {
        must: {
          match: {
            name: 'chocolate'
          }
        },
        should: {
          distance_feature: {
            field: 'location',
            pivot: '1000m',
            origin: [
              -71.3,
              41.15
            ]
          }
        }
      }
    }
  }
)
puts response

JavaScript:

const response = await client.search({
  index: "items",
  query: {
    bool: {
      must: {
        match: {
          name: "chocolate",
        },
      },
      should: {
        distance_feature: {
          field: "location",
          pivot: "1000m",
          origin: [-71.3, 41.15],
        },
      },
    },
  },
});
console.log(response);

Console:

GET /items/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "chocolate"
        }
      },
      "should": {
        "distance_feature": {
          "field": "location",
          "pivot": "1000m",
          "origin": [-71.3, 41.15]
        }
      }
    }
  }
}
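
The same arithmetic can be sketched for the location example. The snippet below is an approximation under stated assumptions: it measures distance with a haversine great-circle formula on a sphere with a mean radius of 6,371,008.8 m, which may differ slightly from the distance Elasticsearch computes internally. The helper names haversine_m and geo_feature_score are hypothetical. Coordinates are in [longitude, latitude] order, matching the array form used in the requests above.

```python
import math

# Assumption of this sketch: Earth modeled as a sphere with this mean radius.
EARTH_RADIUS_M = 6371008.8


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geo_feature_score(origin, point, pivot_m, boost=1.0):
    """Hypothetical helper: boost * pivot / (pivot + distance) for geo points."""
    distance = haversine_m(origin[0], origin[1], point[0], point[1])
    return boost * pivot_m / (pivot_m + distance)


origin = (-71.3, 41.15)  # [lon, lat], as in the query
pivot_m = 1000.0         # "1000m" expressed in meters

# location values of the three sample documents, keyed by document ID.
locations = {
    "1": (-71.34, 41.12),
    "2": (-71.3, 41.15),
    "3": (-71.3, 41.12),
}

geo_scores = {
    doc_id: geo_feature_score(origin, point, pivot_m)
    for doc_id, point in locations.items()
}

# Print documents from nearest (highest score) to farthest.
for doc_id, score in sorted(geo_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(doc_id, round(score, 4))
```

Document 2 sits exactly at the origin, so its distance is zero and it receives the full boost. Documents 3 and 1 are several kilometers away, well beyond the 1000m pivot, so each receives less than half of it.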

Top-level parameters for distance_feature

field

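(Required, string) Name of the field used to calculate distances. This field must meet the following criteria:

    • Be a date, date_nanos, or geo_point field
    • Have an index mapping parameter value of true, which is the default
    • Have a doc_values mapping parameter value of true, which is the default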

origin

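(Required, string) Date or point of origin used to calculate distances.

If the field value is a date or date_nanos field, the origin value must be a date. Date math, such as now-1h, is supported.

If the field value is a geo_point field, the origin value must be a geopoint.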

pivot

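(Required, time unit or distance unit) Distance from the origin at which relevance scores receive half of the boost value.

If the field value is a date or date_nanos field, the pivot value must be a time unit, such as 1h or 10d.

If the field value is a geo_point field, the pivot value must be a distance unit, such as 1km or 12m.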

boost

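(Optional, float) Floating point number used to multiply the relevance score of matching documents. This value cannot be negative. Defaults to 1.0.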
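
The parameters above can be gathered into a small builder that checks the documented constraints before a request is sent. This is a minimal sketch: build_distance_feature is a hypothetical helper, not part of the Elasticsearch client, and it validates only what this page states (the three required parameters and a non-negative boost). It does not verify that pivot is a valid time or distance unit.

```python
def build_distance_feature(field, origin, pivot, boost=1.0):
    """Hypothetical helper that returns a distance_feature query clause as a dict."""
    # field, origin, and pivot are required parameters.
    for name, value in (("field", field), ("origin", origin), ("pivot", pivot)):
        if value is None or value == "":
            raise ValueError(f"{name} is required")
    # boost cannot be negative.
    if boost < 0:
        raise ValueError("boost cannot be negative")
    return {
        "distance_feature": {
            "field": field,
            "origin": origin,
            "pivot": pivot,
            "boost": boost,
        }
    }


# Date field: origin is a date (date math allowed), pivot is a time unit.
date_clause = build_distance_feature("production_date", "now-1h", "10d")

# geo_point field: origin is a geopoint, pivot is a distance unit.
geo_clause = build_distance_feature("location", [-71.3, 41.15], "1km", boost=2.0)

print(date_clause)
print(geo_clause)
```

Either clause could then be placed under the should key of a bool query, as in the example queries.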

Notes

How the distance_feature query calculates relevance scores

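The distance_feature query dynamically calculates the distance between the origin value and a document's field values. It then uses this distance as a feature to boost the relevance score of closer documents.

The distance_feature query calculates a document's relevance score as follows: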

relevance score = boost * pivot / (pivot + distance)

The distance is the absolute difference between the origin value and a document's field value.
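
A short sketch of this formula shows how the score decays as distance grows in multiples of the pivot. The function name relevance_score is hypothetical, and distance and pivot are assumed to be expressed in the same unit (for example, both in days or both in meters).

```python
def relevance_score(distance, pivot, boost=1.0):
    """boost * pivot / (pivot + distance); distance and pivot share one unit."""
    if pivot <= 0:
        raise ValueError("pivot must be positive")
    if distance < 0:
        raise ValueError("distance is an absolute difference and cannot be negative")
    return boost * pivot / (pivot + distance)


pivot = 7  # for example, 7 days

# Distances of 0, 1, 3, and 7 pivots from the origin.
for multiple in (0, 1, 3, 7):
    distance = multiple * pivot
    print(multiple, relevance_score(distance, pivot))
```

At a distance of zero the document receives the full boost. At exactly one pivot it receives half, matching the definition of the pivot parameter. The score keeps shrinking toward zero as distance grows but never becomes negative.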

Skip non-competitive hits

Unlike the function_score query or other ways to change relevance scores, the distance_feature query efficiently skips non-competitive hits when the track_total_hits parameter is not true.
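
As an illustration, the request body below sets track_total_hits to false explicitly so that this condition holds. It is a minimal sketch that only builds and prints the JSON body for a GET /items/_search request against the sample index. Whether skipping noticeably speeds up a particular search depends on the index and the query, which the sketch does not measure.

```python
import json

# Search body for the sample items index. With track_total_hits disabled,
# an exact total hit count is not requested.
search_body = {
    "track_total_hits": False,
    "query": {
        "distance_feature": {
            "field": "production_date",
            "pivot": "7d",
            "origin": "now",
        }
    },
}

# Serialize to the JSON that would be sent as the request body.
print(json.dumps(search_body, indent=2))
```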