Classic token filter

Performs optional post-processing of terms generated by the classic tokenizer.

This filter removes the English possessive ('s) from the end of words and removes the dots from acronyms. It uses Lucene's ClassicFilter.
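The two rewrites described above can be approximated in a few lines of Ruby. This is only a rough sketch and not Lucene's ClassicFilter: the `ACRONYM` pattern and the `classic_filter_sketch` helper are hypothetical names introduced here, and plain regular expressions stand in for whatever token classification the classic tokenizer performs.

```ruby
# Rough approximation of the classic token filter's two rewrites.
# Assumption: an "acronym" is a run of single letters, each followed by a dot.
ACRONYM = /\A(?:[A-Za-z]\.)+\z/

# Hypothetical helper, not part of any Elasticsearch or Lucene API.
def classic_filter_sketch(tokens)
  tokens.map do |token|
    if token.match?(ACRONYM)
      # Drop every dot from an acronym: "Q.U.I.C.K." becomes "QUICK".
      token.delete('.')
    else
      # Drop a trailing possessive: "dog's" becomes "dog".
      token.sub(/'[sS]\z/, '')
    end
  end
end

p classic_filter_sketch(['The', 'Q.U.I.C.K.', "dog's", 'bone'])
# prints ["The", "QUICK", "dog", "bone"]
```

Tokens that match neither pattern pass through unchanged, which mirrors the "optional post-processing" role of the filter.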

Example

The following analyze API request demonstrates how the classic token filter works.

response = client.indices.analyze(
  body: {
    tokenizer: 'classic',
    filter: [
      'classic'
    ],
    text: "The 2 Q.U.I.C.K. Brown-Foxes jumped over the lazy dog's bone."
  }
)
puts response

GET /_analyze
{
  "tokenizer" : "classic",
  "filter" : ["classic"],
  "text" : "The 2 Q.U.I.C.K. Brown-Foxes jumped over the lazy dog's bone."
}

The filter produces the following tokens:

[ The, 2, QUICK, Brown, Foxes, jumped, over, the, lazy, dog, bone ]

Add to an analyzer

The following create index API request uses the classic token filter to configure a new custom analyzer.

response = client.indices.create(
  index: 'classic_example',
  body: {
    settings: {
      analysis: {
        analyzer: {
          classic_analyzer: {
            tokenizer: 'classic',
            filter: [
              'classic'
            ]
          }
        }
      }
    }
  }
)
puts response

PUT /classic_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "classic_analyzer": {
          "tokenizer": "classic",
          "filter": [ "classic" ]
        }
      }
    }
  }
}
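
Once the index exists, one way to check the custom analyzer is to send an analyze request that names it. The sketch below only builds and prints the request body so that it runs without a cluster. The sample text is an assumption chosen for illustration, and the commented-out call assumes a `client` configured as in the examples above.

```ruby
require 'json'

# Request body for an analyze request against the classic_example index.
# The sample text is illustrative only.
body = {
  analyzer: 'classic_analyzer',
  text: "The lazy dog's bone."
}

# With a configured client, the body could be sent along these lines:
#   response = client.indices.analyze(index: 'classic_example', body: body)
#   puts response
puts JSON.pretty_generate(body)
```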