polish_stop token filter
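The polish_stop token filter removes Polish stopwords (_polish_) as well as any additional custom stopwords specified by the user. This filter supports only the predefined _polish_ stopword list. To use a different predefined list, use the stop token filter instead.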
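For comparison, a minimal sketch of the alternative: the built-in stop token filter configured with another predefined list. The index name stop_example and the filter name english_stop are hypothetical, and _english_ is used here only as one example of a predefined list.

```console
PUT /stop_example
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "analyzer_with_stop": {
            "tokenizer": "standard",
            "filter": [ "lowercase", "english_stop" ]
          }
        },
        "filter": {
          "english_stop": {
            "type": "stop",
            "stopwords": "_english_"
          }
        }
      }
    }
  }
}
```

The analyzer has the same shape as one built on polish_stop; only the filter type and the stopwords value differ.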
PUT /polish_stop_example
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "analyzer_with_stop": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "polish_stop"
            ]
          }
        },
        "filter": {
          "polish_stop": {
            "type": "polish_stop",
            "stopwords": [
              "_polish_",
              "jeść"
            ]
          }
        }
      }
    }
  }
}

GET polish_stop_example/_analyze
{
  "analyzer": "analyzer_with_stop",
  "text": "Gdzie kucharek sześć, tam nie ma co jeść."
}
The above request returns:
{
  "tokens" : [
    {
      "token" : "kucharek",
      "start_offset" : 6,
      "end_offset" : 14,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "sześć",
      "start_offset" : 15,
      "end_offset" : 20,
      "type" : "<ALPHANUM>",
      "position" : 2
    }
  ]
}