
Query and filter context


Query context


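In query context, a query clause answers the question "How well does this document match this query clause?" Besides deciding whether the document matches, the query clause also calculates a relevance score in the _score metadata field.

Query context is in effect whenever a query clause is passed to a query parameter, such as the query parameter in the search API.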
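
As a minimal sketch of query context, the Python snippet below builds a request body whose only clause sits under query, plus a small helper that reads the _score each hit carries. The helper name scores_by_id and the use of the title field are illustrative assumptions. The body is a plain dict of the kind that would be sent to the search API, and nothing here contacts a cluster.

```python
# A match clause passed directly to the `query` parameter runs in query context,
# so every matching hit carries a relevance score in `_score`.
query_context_body = {
    "query": {
        "match": {"title": "search"}  # `title` is an assumed field name
    }
}


def scores_by_id(response):
    """Map each hit's `_id` to its `_score` from a search response dict."""
    return {hit["_id"]: hit["_score"] for hit in response["hits"]["hits"]}
```

Passing the response returned by a search call for query_context_body to scores_by_id would give one score per matching document.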

Filter context


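A filter answers the binary question "Does this document match this query clause?" The answer is simply "yes" or "no". Filtering has several benefits:

Simple binary logic: in filter context, a query clause decides whether a document matches on a yes/no basis, with no score calculation.
Performance: because they do not compute relevance scores, filters execute faster than queries.
Caching: Elasticsearch automatically caches frequently used filters, which speeds up subsequent searches.
Resource efficiency: filters consume less CPU than full-text queries.
Query combination: filters can be combined with scored queries to refine result sets efficiently.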

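Filters are especially effective for querying structured data and for implementing "must have" criteria in complex searches.

Structured data is information that is highly organized and formatted in a predefined way. In the context of Elasticsearch, this typically includes:

Numeric fields (integers, floating-point numbers)
Dates and timestamps
Boolean values
Keyword fields (exact-match strings)
Geo-points and geo-shapes

Unlike full-text fields, structured data has a consistent, predictable format, which makes it ideal for precise filtering operations.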

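Common filter applications include:

Date range checks: for example, is the timestamp field between 2015 and 2016?
Specific field value checks: for example, is the status field equal to "published", or is the author field equal to "John Doe"?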
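
The two kinds of check above could be written roughly as follows. This is a sketch using plain Python dicts. The exact date bounds are an assumption about what "between 2015 and 2016" should cover, and the term clauses assume that status and author are mapped as keyword fields.

```python
# Date range check: `timestamp` falls in 2015 or 2016.
# Treating "between 2015 and 2016" as covering all of 2016 is an assumption.
date_range_filter = {
    "range": {"timestamp": {"gte": "2015-01-01", "lt": "2017-01-01"}}
}

# Exact-value checks: term clauses compare against the stored value as-is,
# so they suit keyword fields (assumed here for `status` and `author`).
status_filter = {"term": {"status": "published"}}
author_filter = {"term": {"author": "John Doe"}}

# Placed under bool.filter, all three clauses run in filter context.
filter_only_body = {
    "query": {
        "bool": {"filter": [date_range_filter, status_filter, author_filter]}
    }
}
```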

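Filter context applies when a query clause is passed to a filter parameter, such as:

The filter or must_not parameters in the bool query
The filter parameter in the constant_score query
The filter aggregation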
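
For illustration, the three places listed above might look like this as request bodies. This is a hedged sketch: the aggregation name published_docs and the must_not date bound are hypothetical, and the field names are borrowed from the examples elsewhere on this page.

```python
published = {"term": {"status": "published"}}

# 1. bool query: clauses under `filter` and `must_not` are not scored.
bool_body = {
    "query": {
        "bool": {
            "filter": [published],
            # Hypothetical exclusion: drop anything published before 2015.
            "must_not": [{"range": {"publish_date": {"lt": "2015-01-01"}}}],
        }
    }
}

# 2. constant_score query: wraps a filter and gives every match the same
#    score, equal to `boost` (1.0 when not set).
constant_score_body = {"query": {"constant_score": {"filter": published}}}

# 3. filter aggregation: narrows the set of documents the aggregation covers.
filter_agg_body = {
    "size": 0,  # return only the aggregation, no hits
    "aggs": {"published_docs": {"filter": published}},
}
```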

Filters improve query performance and efficiency, especially for structured data queries and when combined with full-text searches.

Example of query and filter contexts


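Below is an example of query clauses used in query and filter context in the search API. This query matches documents that meet all of the following conditions:

The title field contains the word search.
The content field contains the word elasticsearch.
The status field contains the exact word published.
The publish_date field contains a date on or after 1 January 2015.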

$params = [
    'body' => [
        'query' => [
            'bool' => [
                'must' => [
                    [
                        'match' => [
                            'title' => 'Search',
                        ],
                    ],
                    [
                        'match' => [
                            'content' => 'Elasticsearch',
                        ],
                    ],
                ],
                'filter' => [
                    [
                        'term' => [
                            'status' => 'published',
                        ],
                    ],
                    [
                        'range' => [
                            'publish_date' => [
                                'gte' => '2015-01-01',
                            ],
                        ],
                    ],
                ],
            ],
        ],
    ],
];
$response = $client->search($params);

resp = client.search(
    query={
        "bool": {
            "must": [
                {
                    "match": {
                        "title": "Search"
                    }
                },
                {
                    "match": {
                        "content": "Elasticsearch"
                    }
                }
            ],
            "filter": [
                {
                    "term": {
                        "status": "published"
                    }
                },
                {
                    "range": {
                        "publish_date": {
                            "gte": "2015-01-01"
                        }
                    }
                }
            ]
        }
    },
)
print(resp)

response = client.search(
  body: {
    query: {
      bool: {
        must: [
          {
            match: {
              title: 'Search'
            }
          },
          {
            match: {
              content: 'Elasticsearch'
            }
          }
        ],
        filter: [
          {
            term: {
              status: 'published'
            }
          },
          {
            range: {
              publish_date: {
                gte: '2015-01-01'
              }
            }
          }
        ]
      }
    }
  }
)
puts response

res, err := es.Search(
	es.Search.WithBody(strings.NewReader(`{
	  "query": {
	    "bool": {
	      "must": [
	        {
	          "match": {
	            "title": "Search"
	          }
	        },
	        {
	          "match": {
	            "content": "Elasticsearch"
	          }
	        }
	      ],
	      "filter": [
	        {
	          "term": {
	            "status": "published"
	          }
	        },
	        {
	          "range": {
	            "publish_date": {
	              "gte": "2015-01-01"
	            }
	          }
	        }
	      ]
	    }
	  }
	}`)),
	es.Search.WithPretty(),
)
fmt.Println(res, err)

const response = await client.search({
  query: {
    bool: {
      must: [
        {
          match: {
            title: "Search",
          },
        },
        {
          match: {
            content: "Elasticsearch",
          },
        },
      ],
      filter: [
        {
          term: {
            status: "published",
          },
        },
        {
          range: {
            publish_date: {
              gte: "2015-01-01",
            },
          },
        },
      ],
    },
  },
});
console.log(response);

GET /_search
{
  "query": { 
    "bool": { 
      "must": [
        { "match": { "title":   "Search"        }},
        { "match": { "content": "Elasticsearch" }}
      ],
      "filter": [ 
        { "term":  { "status": "published" }},
        { "range": { "publish_date": { "gte": "2015-01-01" }}}
      ]
    }
  }
}

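	The `query` parameter indicates query context.
	The `bool` clause and the two `match` clauses are used in query context, which means they are used to score how well each document matches.
	The `filter` parameter indicates filter context. Its `term` and `range` clauses are used in filter context. They filter out documents that do not match, but they do not affect the score of the documents that do.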

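Scores calculated for queries in query context are represented as single-precision floating-point numbers, which have only 24 bits of significand precision. A score calculation that exceeds the significand's precision is converted to a float with loss of precision.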
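
The 24-bit limit can be demonstrated without Elasticsearch. The sketch below uses Python's standard struct module to round values to single precision; the helper name to_float32 is an assumption made for this illustration and is not part of any Elasticsearch API.

```python
import struct


def to_float32(value):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


exact = to_float32(2.0 ** 24)          # 16777216 fits in a 24-bit significand
rounded = to_float32(2.0 ** 24 + 1.0)  # 16777217 would need 25 bits

print(exact == 2.0 ** 24)          # True
print(rounded == 2.0 ** 24 + 1.0)  # False: the value was rounded
```

Any integer-valued score above 2^24 is therefore not guaranteed to survive unchanged.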

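Use query clauses in query context for conditions that should affect the score of matching documents (that is, how well the document matches), and use all other query clauses in filter context.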
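
One way to apply this rule in client code is to keep the two kinds of clause apart until the request body is assembled. The helper below is a hypothetical sketch and not part of any Elasticsearch client; it simply places score-affecting clauses under must and everything else under filter.

```python
def build_bool_query(scoring_clauses, filter_clauses):
    """Put score-affecting clauses under `must` and yes/no conditions under `filter`."""
    return {
        "query": {
            "bool": {
                "must": list(scoring_clauses),
                "filter": list(filter_clauses),
            }
        }
    }


# Rebuilds the body of the example request shown earlier on this page.
body = build_bool_query(
    scoring_clauses=[
        {"match": {"title": "Search"}},
        {"match": {"content": "Elasticsearch"}},
    ],
    filter_clauses=[
        {"term": {"status": "published"}},
        {"range": {"publish_date": {"gte": "2015-01-01"}}},
    ],
)
```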
