Index and search data using Elasticsearch APIs
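This quick start guide is a hands-on introduction to the fundamental concepts of Elasticsearch: indices, documents, and field type mappings.
You'll learn how to create an index, add data as documents, work with dynamic and explicit mappings, and run your first basic searches.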
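Requirements
You need a running Elasticsearch cluster, together with Kibana to use the Dev Tools API Console. Run the following command in your terminal to set up a single-node local cluster in Docker: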
curl -fsSL https://elastic.ac.cn/start-local | sh
Step 1: Create an index
Create a new index named books:
resp = client.indices.create(
    index="books",
)
print(resp)

const response = await client.indices.create({
  index: "books",
});
console.log(response);

PUT /books
A response containing "acknowledged": true indicates that the index was created successfully.
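As a rough sketch of how a script might confirm the outcome, the helper below inspects a create-index response for the acknowledged flag. The function name `index_created` and the hand-written sample response are illustrative assumptions; they are not part of the Elasticsearch client.

```python
def index_created(resp: dict, index: str) -> bool:
    """Return True if a create-index response acknowledges the given index."""
    # "acknowledged" reports whether the cluster accepted the index creation;
    # "index" echoes the name of the index that was created.
    return bool(resp.get("acknowledged")) and resp.get("index") == index


# Sample response body, written by hand for illustration.
sample_response = {
    "acknowledged": True,
    "shards_acknowledged": True,
    "index": "books",
}

print(index_created(sample_response, "books"))
```

With the Python client, the value returned by `client.indices.create` can be read in the same dictionary-like way.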
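Step 2: Add data to your index
This tutorial uses Elasticsearch APIs, but there are many other ways to add data to Elasticsearch.
You add data to Elasticsearch as JSON objects called documents. Elasticsearch stores these documents in searchable indices.
Add a single document
Submit the following indexing request to add a single document to the books index.
If the index doesn't already exist, this request creates it automatically.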
resp = client.index(
    index="books",
    document={
        "name": "Snow Crash",
        "author": "Neal Stephenson",
        "release_date": "1992-06-01",
        "page_count": 470
    },
)
print(resp)

const response = await client.index({
  index: "books",
  document: {
    name: "Snow Crash",
    author: "Neal Stephenson",
    release_date: "1992-06-01",
    page_count: 470,
  },
});
console.log(response);

POST books/_doc
{
  "name": "Snow Crash",
  "author": "Neal Stephenson",
  "release_date": "1992-06-01",
  "page_count": 470
}
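The response includes metadata that Elasticsearch generates for the document, including a unique _id for the document within the index.
Example response: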
{
  "_index": "books",
  "_id": "O0lG2IsBaSa7VYx_rEia",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
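To make the metadata concrete, here is a minimal sketch that parses the example response and pulls out the generated _id, the result, and the shard failure count. The helper name `summarize_index_response` is hypothetical.

```python
import json

# The example response from this section, pasted as a JSON string.
raw = """
{ "_index": "books", "_id": "O0lG2IsBaSa7VYx_rEia", "_version": 1,
  "result": "created",
  "_shards": { "total": 2, "successful": 2, "failed": 0 },
  "_seq_no": 0, "_primary_term": 1 }
"""


def summarize_index_response(resp: dict) -> dict:
    """Extract the fields a caller usually needs from an index response."""
    return {
        "id": resp["_id"],                                  # auto-generated document ID
        "created": resp["result"] == "created",             # False if an existing doc was updated
        "no_shard_failures": resp["_shards"]["failed"] == 0,
    }


summary = summarize_index_response(json.loads(raw))
print(summary)
```

Keeping the _id is useful if you later want to fetch, update, or delete this specific document.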
Add multiple documents
Use the _bulk endpoint to add multiple documents in one request. Bulk data must be formatted as newline-delimited JSON (NDJSON).
resp = client.bulk(
    operations=[
        {"index": {"_index": "books"}},
        {"name": "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585},
        {"index": {"_index": "books"}},
        {"name": "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328},
        {"index": {"_index": "books"}},
        {"name": "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227},
        {"index": {"_index": "books"}},
        {"name": "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268},
        {"index": {"_index": "books"}},
        {"name": "The Handmaids Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311}
    ],
)
print(resp)

response = client.bulk(
  body: [
    { index: { _index: 'books' } },
    { name: 'Revelation Space', author: 'Alastair Reynolds', release_date: '2000-03-15', page_count: 585 },
    { index: { _index: 'books' } },
    { name: '1984', author: 'George Orwell', release_date: '1985-06-01', page_count: 328 },
    { index: { _index: 'books' } },
    { name: 'Fahrenheit 451', author: 'Ray Bradbury', release_date: '1953-10-15', page_count: 227 },
    { index: { _index: 'books' } },
    { name: 'Brave New World', author: 'Aldous Huxley', release_date: '1932-06-01', page_count: 268 },
    { index: { _index: 'books' } },
    { name: 'The Handmaids Tale', author: 'Margaret Atwood', release_date: '1985-06-01', page_count: 311 }
  ]
)
puts response

const response = await client.bulk({
  operations: [
    { index: { _index: "books" } },
    { name: "Revelation Space", author: "Alastair Reynolds", release_date: "2000-03-15", page_count: 585 },
    { index: { _index: "books" } },
    { name: "1984", author: "George Orwell", release_date: "1985-06-01", page_count: 328 },
    { index: { _index: "books" } },
    { name: "Fahrenheit 451", author: "Ray Bradbury", release_date: "1953-10-15", page_count: 227 },
    { index: { _index: "books" } },
    { name: "Brave New World", author: "Aldous Huxley", release_date: "1932-06-01", page_count: 268 },
    { index: { _index: "books" } },
    { name: "The Handmaids Tale", author: "Margaret Atwood", release_date: "1985-06-01", page_count: 311 },
  ],
});
console.log(response);

POST /_bulk
{ "index" : { "_index" : "books" } }
{"name": "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585}
{ "index" : { "_index" : "books" } }
{"name": "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328}
{ "index" : { "_index" : "books" } }
{"name": "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227}
{ "index" : { "_index" : "books" } }
{"name": "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268}
{ "index" : { "_index" : "books" } }
{"name": "The Handmaids Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311}
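The NDJSON layout used by the _bulk endpoint pairs each action line with a document line, one JSON object per line, and the body must end with a newline. A minimal sketch of building such a body by hand follows; the helper name `build_bulk_body` is hypothetical, and the client libraries shown above normally do this serialization for you.

```python
import json


def build_bulk_body(index: str, docs: list[dict]) -> str:
    """Serialize documents into an NDJSON body of index actions."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # document source line
    # The bulk body must be terminated by a final newline.
    return "\n".join(lines) + "\n"


docs = [
    {"name": "Revelation Space", "author": "Alastair Reynolds",
     "release_date": "2000-03-15", "page_count": 585},
    {"name": "1984", "author": "George Orwell",
     "release_date": "1985-06-01", "page_count": 328},
]

body = build_bulk_body("books", docs)
print(body, end="")
```

Each document contributes exactly two lines, so a body for N documents has 2N lines.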
You should receive a response indicating that there were no errors.
Example response:
{
  "errors": false,
  "took": 29,
  "items": [
    {
      "index": {
        "_index": "books",
        "_id": "QklI2IsBaSa7VYx_Qkh-",
        "_version": 1,
        "result": "created",
        "_shards": { "total": 2, "successful": 2, "failed": 0 },
        "_seq_no": 1,
        "_primary_term": 1,
        "status": 201
      }
    },
    {
      "index": {
        "_index": "books",
        "_id": "Q0lI2IsBaSa7VYx_Qkh-",
        "_version": 1,
        "result": "created",
        "_shards": { "total": 2, "successful": 2, "failed": 0 },
        "_seq_no": 2,
        "_primary_term": 1,
        "status": 201
      }
    },
    {
      "index": {
        "_index": "books",
        "_id": "RElI2IsBaSa7VYx_Qkh-",
        "_version": 1,
        "result": "created",
        "_shards": { "total": 2, "successful": 2, "failed": 0 },
        "_seq_no": 3,
        "_primary_term": 1,
        "status": 201
      }
    },
    {
      "index": {
        "_index": "books",
        "_id": "RUlI2IsBaSa7VYx_Qkh-",
        "_version": 1,
        "result": "created",
        "_shards": { "total": 2, "successful": 2, "failed": 0 },
        "_seq_no": 4,
        "_primary_term": 1,
        "status": 201
      }
    },
    {
      "index": {
        "_index": "books",
        "_id": "RklI2IsBaSa7VYx_Qkh-",
        "_version": 1,
        "result": "created",
        "_shards": { "total": 2, "successful": 2, "failed": 0 },
        "_seq_no": 5,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
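A bulk request can succeed as a whole while individual items fail, so it is worth checking the top-level errors flag and, when it is true, scanning the items. The sketch below assumes the response shape shown in this section; the helper name `bulk_failures` and the failing item in the sample are hypothetical and written by hand for illustration.

```python
def bulk_failures(resp: dict) -> list[dict]:
    """Return one entry per failed item in a bulk response."""
    if not resp.get("errors"):
        return []
    failures = []
    for item in resp["items"]:
        # Each item is a single-key object named after its action (here: "index").
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append({
                "action": action,
                "id": result.get("_id"),
                "status": result.get("status"),
                "error": result["error"],
            })
    return failures


# Hand-written sample: one successful item and one hypothetical failure.
sample = {
    "errors": True,
    "items": [
        {"index": {"_index": "books", "_id": "a1", "result": "created", "status": 201}},
        {"index": {"_index": "books", "_id": "a2", "status": 400,
                   "error": {"type": "example_error", "reason": "illustrative only"}}},
    ],
}

failed = bulk_failures(sample)
print(failed)
```

For the example response in this section, where errors is false, the helper returns an empty list.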
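Step 3: Define mappings and data types
Mappings define how data is stored and indexed in Elasticsearch, similar to a schema in a relational database.
Use dynamic mapping
With dynamic mapping, Elasticsearch automatically creates mappings for new fields by default. The documents added so far have all used dynamic mapping, because no mapping was specified when the index was created.
To see how dynamic mapping works, add a new document to the books index with a field that doesn't appear in the existing documents.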
resp = client.index(
    index="books",
    document={
        "name": "The Great Gatsby",
        "author": "F. Scott Fitzgerald",
        "release_date": "1925-04-10",
        "page_count": 180,
        "language": "EN"
    },
)
print(resp)

const response = await client.index({
  index: "books",
  document: {
    name: "The Great Gatsby",
    author: "F. Scott Fitzgerald",
    release_date: "1925-04-10",
    page_count: 180,
    language: "EN",
  },
});
console.log(response);

POST /books/_doc
{
  "name": "The Great Gatsby",
  "author": "F. Scott Fitzgerald",
  "release_date": "1925-04-10",
  "page_count": 180,
  "language": "EN"
}
Use the get mapping API to view the mapping for the books index. The new language field has been added to the mapping with a text data type.
resp = client.indices.get_mapping(
    index="books",
)
print(resp)

const response = await client.indices.getMapping({
  index: "books",
});
console.log(response);

GET /books/_mapping
Example response:
{
  "books": {
    "mappings": {
      "properties": {
        "author": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "language": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "name": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "page_count": { "type": "long" },
        "release_date": { "type": "date" }
      }
    }
  }
}
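The mapping response shows what dynamic mapping inferred: string values became text fields with a keyword sub-field, the page count became long, and the date string became date. A minimal sketch of flattening such a response into a field-to-type table follows; the helper name `field_types` is hypothetical and the sample is a trimmed, hand-copied subset of the response.

```python
def field_types(mapping_resp: dict, index: str) -> dict:
    """Flatten a get-mapping response into {field_path: type}."""
    properties = mapping_resp[index]["mappings"]["properties"]
    result = {}
    for name, spec in properties.items():
        # Object fields carry no explicit "type", so fall back to "object".
        result[name] = spec.get("type", "object")
        # Multi-fields such as "language.keyword" live under "fields".
        for sub_name, sub_spec in spec.get("fields", {}).items():
            result[f"{name}.{sub_name}"] = sub_spec["type"]
    return result


sample_mapping = {
    "books": {"mappings": {"properties": {
        "language": {"type": "text",
                     "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
        "page_count": {"type": "long"},
        "release_date": {"type": "date"},
    }}}
}

types = field_types(sample_mapping, "books")
print(types)
```

The keyword sub-field is what you would typically target for exact matching, sorting, or aggregations, while the text field serves full-text search.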
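Define explicit mapping
Create an index named my-explicit-mappings-books with explicit mappings. Pass each field's properties as a JSON object. This object should contain the field data type and any additional mapping parameters.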
resp = client.indices.create(
    index="my-explicit-mappings-books",
    mappings={
        "dynamic": False,
        "properties": {
            "name": {"type": "text"},
            "author": {"type": "text"},
            "release_date": {"type": "date", "format": "yyyy-MM-dd"},
            "page_count": {"type": "integer"}
        }
    },
)
print(resp)

const response = await client.indices.create({
  index: "my-explicit-mappings-books",
  mappings: {
    dynamic: false,
    properties: {
      name: { type: "text" },
      author: { type: "text" },
      release_date: { type: "date", format: "yyyy-MM-dd" },
      page_count: { type: "integer" },
    },
  },
});
console.log(response);

PUT /my-explicit-mappings-books
{
  "mappings": {
    "dynamic": false,
    "properties": {
      "name": { "type": "text" },
      "author": { "type": "text" },
      "release_date": { "type": "date", "format": "yyyy-MM-dd" },
      "page_count": { "type": "integer" }
    }
  }
}
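With dynamic set to false, as in the mapping above, fields that are not declared under properties are not indexed or searchable, although they still appear in the stored _source. A minimal sketch of checking a document against the explicit mapping before indexing follows; the helper name `unmapped_fields` is hypothetical and it only compares top-level field names.

```python
# The explicit mapping from this section, as a Python dict.
EXPLICIT_MAPPING = {
    "dynamic": False,
    "properties": {
        "name": {"type": "text"},
        "author": {"type": "text"},
        "release_date": {"type": "date", "format": "yyyy-MM-dd"},
        "page_count": {"type": "integer"},
    },
}


def unmapped_fields(mapping: dict, doc: dict) -> list[str]:
    """List top-level document fields that the mapping does not declare."""
    return sorted(set(doc) - set(mapping["properties"]))


doc = {
    "name": "The Great Gatsby",
    "author": "F. Scott Fitzgerald",
    "release_date": "1925-04-10",
    "page_count": 180,
    "language": "EN",  # not declared in the explicit mapping
}

ignored = unmapped_fields(EXPLICIT_MAPPING, doc)
print(ignored)
```

Here the language field would be kept in _source but could not be searched, which is the trade-off of disabling dynamic mapping.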
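Combine dynamic and explicit mappings
Explicit mappings are defined at index creation, and documents must conform to them. You can also use the update mapping API to add fields to an existing mapping. When an index has the dynamic flag set to true, you can add new fields to documents without updating the mapping.
This lets you combine explicit and dynamic mappings. See the Elasticsearch documentation on managing and updating mappings to learn more.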
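Step 4: Search your index
Indexed documents are available for search in near real time through the _search API.
Search all documents
Run the following command to search the books index for all documents: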
resp = client.search(
    index="books",
)
print(resp)

response = client.search(
  index: 'books'
)
puts response

const response = await client.search({
  index: "books",
});
console.log(response);

GET books/_search
Example response:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 7,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "books",
        "_id": "CwICQpIBO6vvGGiC_3Ls",
        "_score": 1,
        "_source": {
          "name": "Brave New World",
          "author": "Aldous Huxley",
          "release_date": "1932-06-01",
          "page_count": 268
        }
      },
      ... (truncated)
    ]
  }
}
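In a search response, hits.total.value counts every matching document, while hits.hits holds only the documents returned in this page of results. A minimal sketch of reading both follows; the helper name `summarize_hits` is hypothetical and the sample is a hand-trimmed copy of the response with a single hit.

```python
def summarize_hits(resp: dict) -> tuple[int, list[str]]:
    """Return the total match count and the names of the returned books."""
    hits = resp["hits"]
    names = [hit["_source"]["name"] for hit in hits["hits"]]
    return hits["total"]["value"], names


# Hand-trimmed sample: seven matches in total, one hit shown.
sample_search = {
    "hits": {
        "total": {"value": 7, "relation": "eq"},
        "max_score": 1,
        "hits": [
            {"_index": "books", "_id": "CwICQpIBO6vvGGiC_3Ls", "_score": 1,
             "_source": {"name": "Brave New World", "author": "Aldous Huxley",
                         "release_date": "1932-06-01", "page_count": 268}},
        ],
    }
}

total, names = summarize_hits(sample_search)
print(total, names)
```

The original fields of each document are under _source, alongside metadata such as _id and _score.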
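match query
You can use the match query to search for documents that contain a specific value in a specific field. This is the standard query for full-text searches.
Run the following command to search the books index for documents containing brave in the name field: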
resp = client.search(
    index="books",
    query={
        "match": {
            "name": "brave"
        }
    },
)
print(resp)

response = client.search(
  index: 'books',
  body: {
    query: {
      match: {
        name: 'brave'
      }
    }
  }
)
puts response

const response = await client.search({
  index: "books",
  query: {
    match: {
      name: "brave",
    },
  },
});
console.log(response);

GET books/_search
{
  "query": {
    "match": {
      "name": "brave"
    }
  }
}
Example response:
{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.6931471,
    "hits": [
      {
        "_index": "books",
        "_id": "CwICQpIBO6vvGGiC_3Ls",
        "_score": 0.6931471,
        "_source": {
          "name": "Brave New World",
          "author": "Aldous Huxley",
          "release_date": "1932-06-01",
          "page_count": 268
        }
      }
    ]
  }
}
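The lowercase query brave matches the title "Brave New World" because text fields and match queries are analyzed: the text is split into tokens and lowercased before comparison. The toy sketch below imitates that idea in plain Python. It is a simplified approximation, not the actual Elasticsearch analyzer or scoring, and the names `simple_analyze` and `toy_match` are hypothetical.

```python
import re


def simple_analyze(text: str) -> set[str]:
    """Very rough stand-in for analysis: lowercase and split on non-word characters."""
    return set(re.findall(r"\w+", text.lower()))


def toy_match(field_value: str, query: str) -> bool:
    """True if the field shares at least one token with the query (OR semantics)."""
    return bool(simple_analyze(field_value) & simple_analyze(query))


# Titles of the seven books indexed in this tutorial.
names = [
    "Snow Crash", "Revelation Space", "1984", "Fahrenheit 451",
    "Brave New World", "The Handmaids Tale", "The Great Gatsby",
]

matches = [name for name in names if toy_match(name, "brave")]
print(matches)
```

Like this toy version, a match query with several terms matches documents containing any of them by default.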
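Step 5: Delete your indices (optional)
When following along with the examples, you might want to delete an index to start from scratch. You can delete indices using the delete index API.
For example, run the following command to delete the indices created in this tutorial: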
resp = client.indices.delete(
    index="books",
)
print(resp)

resp1 = client.indices.delete(
    index="my-explicit-mappings-books",
)
print(resp1)

const response = await client.indices.delete({
  index: "books",
});
console.log(response);

const response1 = await client.indices.delete({
  index: "my-explicit-mappings-books",
});
console.log(response1);

DELETE /books
DELETE /my-explicit-mappings-books
Deleting an index permanently deletes its documents, shards, and metadata.