In the previous chapter you saw how to expand an Elasticsearch index with a dense_vector field populated with embeddings generated by a Machine Learning model. The model was installed locally on your computer, and the embeddings were generated from Python code and added to the documents before they were inserted into the index.
In this chapter you are going to learn about another vector type, the sparse_vector, which is designed to store inferences from the Elastic Learned Sparse EncodeR model (ELSER). Embeddings returned by this model are a collection of tags (more appropriately called features), each with an assigned weight.
In this chapter you will also use a different method for working with Machine Learning models, in which the Elasticsearch service itself runs the model and adds the resulting embeddings to the index through a pipeline.
The sparse_vector Field
Like the dense_vector field type you used in the previous chapter, the sparse_vector type can store inferences returned by Machine Learning models. While dense vectors hold a fixed-length array of numbers that describe the source text, a sparse vector stores a mapping of features to weights.
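To make the difference concrete, here is a small illustrative sketch. The feature names and weights below are invented for this example; real ELSER inferences contain hundreds of weighted features:

```python
# A dense vector is a fixed-length list of floats; a sparse vector is a
# mapping of features to weights, with only relevant features present.
dense = [0.12, -0.45, 0.83, 0.07]  # fixed length, mostly non-zero

sparse = {  # only the features relevant to the text appear
    'work': 2.1,
    'office': 1.4,
    'remote': 0.9,
}

def sparse_dot(a, b):
    """Score two sparse vectors: sum of weight products over shared features."""
    return sum(w * b[f] for f, w in a.items() if f in b)

query = {'remote': 1.2, 'home': 0.5}
score = sparse_dot(sparse, query)  # 0.9 * 1.2 = 1.08
```

Scoring a sparse vector against a query only involves the features the two mappings share, which is what makes this representation efficient in spite of the very large feature vocabulary.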
Let's add a sparse_vector field to the index. This type needs to be defined explicitly in the index mapping. Below is an updated version of the create_index() method with a new field called elser_embedding of this type.
class Search:
    # ...

    def create_index(self):
        self.es.indices.delete(index='my_documents', ignore_unavailable=True)
        self.es.indices.create(index='my_documents', mappings={
            'properties': {
                'embedding': {
                    'type': 'dense_vector',
                },
                'elser_embedding': {
                    'type': 'sparse_vector',
                },
            }
        })

    # ...
Deploying the ELSER Model
As mentioned above, in this example Elasticsearch will take ownership of the model and automatically execute it to generate embeddings, both when inserting documents and when searching.
The Elasticsearch client exposes a set of API endpoints to manage Machine Learning models and their pipelines. The following deploy_elser() method in search.py follows a few steps to download and install the ELSER v2 model, and to create a pipeline that uses it to populate the elser_embedding field defined above.
import time

class Search:
    # ...

    def deploy_elser(self):
        # download ELSER v2
        self.es.ml.put_trained_model(model_id='.elser_model_2',
                                     input={'field_names': ['text_field']})

        # wait until the model download completes
        while True:
            status = self.es.ml.get_trained_models(model_id='.elser_model_2',
                                                   include='definition_status')
            if status['trained_model_configs'][0]['fully_defined']:
                # model is ready
                break
            time.sleep(1)

        # deploy the model
        self.es.ml.start_trained_model_deployment(model_id='.elser_model_2')

        # define a pipeline
        self.es.ingest.put_pipeline(
            id='elser-ingest-pipeline',
            processors=[
                {
                    'inference': {
                        'model_id': '.elser_model_2',
                        'input_output': [
                            {
                                'input_field': 'summary',
                                'output_field': 'elser_embedding',
                            }
                        ]
                    }
                }
            ]
        )
Configuring ELSER requires several steps. First, the ml.put_trained_model() method of the Elasticsearch client is used to download ELSER. The model_id argument identifies the model and version to download (ELSER v2 is available in Elasticsearch 8.11 and up). The input field provides the configuration required by this model.
Once the model is downloaded it needs to be deployed. For this, the ml.start_trained_model_deployment() method is used with just the identifier of the model to deploy. Note that this is an asynchronous operation, so the model becomes available for use only after a short amount of time.
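Because the deployment call returns before the model is ready to serve inferences, a small polling helper can be handy in scripts that need to block until the model is up. Here is a minimal, generic sketch; the wait_until name and its arguments are invented for this example and are not part of the Elasticsearch client:

```python
import time

def wait_until(condition, timeout=60.0, interval=1.0):
    """Poll `condition` (a zero-argument callable) until it returns True,
    or until `timeout` seconds elapse. Returns True on success, False on
    timeout. `interval` is the pause between polls, in seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return False
```

A caller could pass a lambda that checks the deployment state reported by the Machine Learning stats APIs, giving up cleanly instead of looping forever if the deployment never starts.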
The final step in configuring ELSER is to define a pipeline for it. A pipeline tells Elasticsearch how the model is to be used; it is given an identifier and one or more processing tasks to perform. The pipeline created above is called elser-ingest-pipeline and has a single inference task: each time a document is added, the model runs on the input_field, and the output is added to the document under the output_field. In this example the summary field is used to generate the embeddings, as with the dense vector embeddings in the previous chapter. The resulting embeddings are written to the elser_embedding sparse vector field created in the previous section.
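To make the processor's behavior concrete, here is a hypothetical simulation of what the inference task does to each document. Both fake_elser and apply_inference_pipeline are stand-ins invented for this sketch, not real Elasticsearch APIs, and the real model produces weighted features rather than uniform weights:

```python
def fake_elser(text):
    """Stand-in for the ELSER model: returns a feature -> weight mapping.
    (Invented output; the real model assigns a learned weight per feature.)"""
    return {word: 1.0 for word in text.lower().split()}

def apply_inference_pipeline(doc, input_field, output_field, model=fake_elser):
    """Mimic the 'inference' processor: run the model on input_field and
    store the result under output_field, leaving the original intact."""
    doc = dict(doc)  # copy, so the caller's document is not mutated
    doc[output_field] = model(doc[input_field])
    return doc

doc = {'summary': 'Work from home policy'}
enriched = apply_inference_pipeline(doc, 'summary', 'elser_embedding')
# enriched now has an 'elser_embedding' field alongside the original data
```

The key point is that the enrichment happens inside Elasticsearch at ingest time, so the application code that inserts documents does not change at all.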
To make it easy to invoke this method, add a deploy-elser command to the Flask application in app.py:
@app.cli.command()
def deploy_elser():
    """Deploy the ELSER v2 model to Elasticsearch."""
    try:
        es.deploy_elser()
    except Exception as exc:
        print(f'Error: {exc}')
    else:
        print('ELSER model deployed.')
You can now deploy ELSER on your Elasticsearch service with the following command:
flask deploy-elser
The last configuration task is to link the index with the pipeline, so that the model is automatically executed when documents are inserted into this index. This link is created in the index configuration with a settings option. Here is one more update to the create_index() method:
class Search:
    # ...

    def create_index(self):
        self.es.indices.delete(index='my_documents', ignore_unavailable=True)
        self.es.indices.create(
            index='my_documents',
            mappings={
                'properties': {
                    'embedding': {
                        'type': 'dense_vector',
                    },
                    'elser_embedding': {
                        'type': 'sparse_vector',
                    },
                }
            },
            settings={
                'index': {
                    'default_pipeline': 'elser-ingest-pipeline'
                }
            }
        )
With this change, you can now regenerate the index with full support for ELSER inferences:
flask reindex