We use the mass indexer to create both the index itself and its documents. We have our own versioning scheme: when there is a new version, the idea is to completely delete all documents and index configs and let the mass indexer do its job.

The problem is that, unlike when using Lucene directly, purging all the indexes only deletes the documents. It doesn't delete the index config, which leads to conflicts when we try to modify an existing field.

Is there a way to do this through Hibernate Search or do we need to use the DELETE method on our cluster directly?

1 Answer

I assume you run this mass indexing job as part of your migration procedure from one version of your application to the next. If so, in the one-time job that triggers reindexing, you could start Hibernate with the following setting:

hibernate.search.default.elasticsearch.index_schema_management_strategy drop-and-create

Then, on startup Hibernate Search will drop the index completely, along with its mapping, and recreate it.

Be careful, though: this is fine only if you execute the mass indexing in a dedicated program. Your application should probably not be started with this setting, as that would cause it to drop the indexes on every startup (e.g. if you had to restart your server for whatever reason).

Source: the official documentation
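
To illustrate the "dedicated program" advice above, here is a minimal sketch of how the setting could be kept out of the regular application and applied only to the one-time job. The class and method names (ReindexJobSettings, overrides) are hypothetical; only the property name and the drop-and-create value come from the answer.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch only: builds per-run property overrides so that just the dedicated
 * reindexing job drops and recreates the index. Names are hypothetical.
 */
final class ReindexJobSettings {

    // Property name and value as quoted in the answer above.
    static final String STRATEGY_KEY =
            "hibernate.search.default.elasticsearch.index_schema_management_strategy";
    static final String DROP_AND_CREATE = "drop-and-create";

    private ReindexJobSettings() {}

    /**
     * Returns overrides to layer on top of the regular configuration.
     * The map stays empty unless the caller explicitly requests a reindex run,
     * so an ordinary restart never picks up the destructive setting.
     */
    static Map<String, Object> overrides(boolean reindexRun) {
        Map<String, Object> props = new HashMap<>();
        if (reindexRun) {
            props.put(STRATEGY_KEY, DROP_AND_CREATE);
        }
        return props;
    }
}
```

In a JPA setup, the one-time job could pass this map to Persistence.createEntityManagerFactory(unitName, ReindexJobSettings.overrides(true)) and then start the mass indexer, while the regular application keeps its usual configuration. The persistence unit name is whatever your project already uses.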

On a side note:

"The problem is that, unlike when using Lucene directly, purging all the indexes only deletes the documents. It doesn't delete the index config, which leads to conflicts when we try to modify an existing field."

This is actually the same behavior as the Lucene integration: when you purge a Lucene index, the index itself stays in place and only its content is removed. The main difference is that Elasticsearch keeps some index metadata (the mapping), while "raw" Lucene doesn't.

However, it is true that the current behavior is rather annoying in your case. We will try to address it in the future, probably as part of HSEARCH-2861.

2 Comments

The problem with the index_schema_management_strategy setting is that sometimes we need to reindex only one entity. Also, we have multiple nodes running, with one elected leader to carry out such procedures, so it would be very hard (if not impossible) to change the setting on the fly. I'll create a REST call to the ES cluster to do this for now. Thanks for the informative answer!
@andrehil did you ever find a solution to this issue? We are running into the same thing. It seems that bypassing Hibernate Search and hitting the ES cluster with a DELETE REST call introduces some strange behavior when we then attempt to reindex using the MassIndexer.
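
For reference, the REST workaround mentioned in the comments could be sketched roughly as follows, using the JDK 11+ HTTP client. The class name, base URL, and index name are assumptions for illustration; the sketch only builds the DELETE request for a single, explicitly named index and leaves sending it to the caller.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch only: builds an Elasticsearch delete-index request. Names are hypothetical. */
final class IndexDeleteRequests {

    private IndexDeleteRequests() {}

    /**
     * Builds (but does not send) "DELETE {baseUrl}/{indexName}".
     * Wildcards, comma-separated lists and "_all" are rejected so that a typo
     * cannot turn the call into a multi-index delete.
     */
    static HttpRequest deleteIndex(String baseUrl, String indexName) {
        if (indexName == null || indexName.isEmpty() || indexName.equals("_all")
                || indexName.matches(".*[*,/\\s].*")) {
            throw new IllegalArgumentException("Refusing ambiguous index name: " + indexName);
        }
        // Normalize a trailing slash so the URL has exactly one separator.
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return HttpRequest.newBuilder(URI.create(base + "/" + indexName))
                .DELETE()
                .build();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). One possible, unconfirmed explanation for the strange behavior reported above: since the schema management strategy is applied on startup, an index deleted behind Hibernate Search's back may not get its mapping back until the next startup, and Elasticsearch may instead auto-create the index with a dynamically inferred mapping when the first document arrives.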
