测试分析器
编辑测试分析器编辑
analyze
API 是一个用于查看分析器生成的词项的宝贵工具。可以在请求中内联指定内置分析器
response = client.indices.analyze( body: { analyzer: 'whitespace', text: 'The quick brown fox.' } ) puts response
POST _analyze { "analyzer": "whitespace", "text": "The quick brown fox." }
API 返回以下响应
{ "tokens": [ { "token": "The", "start_offset": 0, "end_offset": 3, "type": "word", "position": 0 }, { "token": "quick", "start_offset": 4, "end_offset": 9, "type": "word", "position": 1 }, { "token": "brown", "start_offset": 10, "end_offset": 15, "type": "word", "position": 2 }, { "token": "fox.", "start_offset": 16, "end_offset": 20, "type": "word", "position": 3 } ] }
您还可以测试以下各项的组合
- 分词器
- 零个或多个词元过滤器
- 零个或多个字符过滤器
response = client.indices.analyze( body: { tokenizer: 'standard', filter: [ 'lowercase', 'asciifolding' ], text: 'Is this déja vu?' } ) puts response
POST _analyze { "tokenizer": "standard", "filter": [ "lowercase", "asciifolding" ], "text": "Is this déja vu?" }
API 返回以下响应
{ "tokens": [ { "token": "is", "start_offset": 0, "end_offset": 2, "type": "<ALPHANUM>", "position": 0 }, { "token": "this", "start_offset": 3, "end_offset": 7, "type": "<ALPHANUM>", "position": 1 }, { "token": "deja", "start_offset": 8, "end_offset": 12, "type": "<ALPHANUM>", "position": 2 }, { "token": "vu", "start_offset": 13, "end_offset": 15, "type": "<ALPHANUM>", "position": 3 } ] }
或者,在对特定索引运行 analyze
API 时,可以引用 自定义
分析器
response = client.indices.create( index: 'my-index-000001', body: { settings: { analysis: { analyzer: { std_folded: { type: 'custom', tokenizer: 'standard', filter: [ 'lowercase', 'asciifolding' ] } } } }, mappings: { properties: { my_text: { type: 'text', analyzer: 'std_folded' } } } } ) puts response response = client.indices.analyze( index: 'my-index-000001', body: { analyzer: 'std_folded', text: 'Is this déjà vu?' } ) puts response response = client.indices.analyze( index: 'my-index-000001', body: { field: 'my_text', text: 'Is this déjà vu?' } ) puts response
PUT my-index-000001 { "settings": { "analysis": { "analyzer": { "std_folded": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "asciifolding" ] } } } }, "mappings": { "properties": { "my_text": { "type": "text", "analyzer": "std_folded" } } } } GET my-index-000001/_analyze { "analyzer": "std_folded", "text": "Is this déjà vu?" } GET my-index-000001/_analyze { "field": "my_text", "text": "Is this déjà vu?" }
API 返回以下响应
{ "tokens": [ { "token": "is", "start_offset": 0, "end_offset": 2, "type": "<ALPHANUM>", "position": 0 }, { "token": "this", "start_offset": 3, "end_offset": 7, "type": "<ALPHANUM>", "position": 1 }, { "token": "deja", "start_offset": 8, "end_offset": 12, "type": "<ALPHANUM>", "position": 2 }, { "token": "vu", "start_offset": 13, "end_offset": 15, "type": "<ALPHANUM>", "position": 3 } ] }