
Is it possible to reduce the number of shards of an Elasticsearch index once the index has been created?

I tried :

$ curl -XPUT 'localhost:9200/myindex/_settings' -d '{"index" : {"number_of_shards" : 3}}'

But it returns an error:

{"error":"ElasticsearchIllegalArgumentException[can't change the number of shards for an index]","status":400}
  • what version of es are you using? Commented Apr 30, 2015 at 8:27
  • @eliasah On my development server: Version: 1.4.4, Build: c88f77f/2015-02-19T13:05:36Z, JVM: 1.7.0_75 Commented Apr 30, 2015 at 8:47
  • Your only option is to create a new index with fewer shards and reindex all data from the old index into the new one with a tool like stream2es Commented Apr 30, 2015 at 9:09

3 Answers


This is no longer true: with 5.x you can shrink an index down to a whole factor of its shard count. For example, from 12 shards you could go down to 1, 2, 3 or 6 (see the docs). But you must first put the index into read-only mode, and the shrink process naturally requires a lot of I/O.
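As a sketch of that flow (assuming a 12-shard index named `myindex`; the target index name `myindex_shrunk` and the node name `shrink-node` are made up here), the 5.x shrink might look like:

```shell
# Prerequisites for _shrink: block writes (read-only mode) and
# relocate a copy of every shard onto a single node
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index.blocks.write": true,
  "index.routing.allocation.require._name": "shrink-node"
}'

# Shrink the 12-shard index into a new 3-shard index
curl -XPOST 'localhost:9200/myindex/_shrink/myindex_shrunk' -d '{
  "settings": { "index.number_of_shards": 3 }
}'
```

Once the new index is green and verified, you would typically delete the old index and point an alias at the shrunk one.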

Alternatively, since version 2.3 you could use the reindex API, which would allow you to change to any number of shards. Reindex would need much more resources than the shrink API, because it would have to go through the process of indexing each document from scratch.
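A minimal reindex sketch (the target index name `myindex_v2` is a made-up example):

```shell
# Create the target index with the desired shard count up front,
# since it cannot be changed afterwards
curl -XPUT 'localhost:9200/myindex_v2' -d '{
  "settings": { "index": { "number_of_shards": 3 } }
}'

# Copy every document over; each one is re-analyzed from scratch,
# which is why this costs more than _shrink
curl -XPOST 'localhost:9200/_reindex' -d '{
  "source": { "index": "myindex" },
  "dest":   { "index": "myindex_v2" }
}'
```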




No, it's not possible. You can change many index settings dynamically - e.g. the number of replicas per shard - but not the number of shards.

For more information - take a look here - http://www.elastic.co/guide/en/elasticsearch/reference/1.5/indices-update-settings.html
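For instance, the replica count is one of the settings that can be updated on a live index, in contrast to the shard count that produced the error above:

```shell
# Allowed: change the number of replicas on an existing index
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index": { "number_of_replicas": 2 }
}'
```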



OK. As @Mysterion said, it's not possible to change the number of shards directly with an index settings update. But there is a way around it.

You'll need to re-index your old index into a new index created with the desired number of shards (so, as noted, no zero downtime).

For that you can use the Scroll Search API:

While a search request returns a single “page” of results, the scroll API can be used to retrieve large numbers of results (or even all results) from a single search request, in much the same way as you would use a cursor on a traditional database.

Scrolling is not intended for real time user requests, but rather for processing large amounts of data, e.g. in order to reindex the contents of one index into a new index with a different configuration.

Client support for scrolling and reindexing : Some of the officially supported clients provide helpers to assist with scrolled searches and reindexing of documents from one index to another:

Perl See Search::Elasticsearch::Bulk and Search::Elasticsearch::Scroll

Python See elasticsearch.helpers.*

For more information about the Scroll Search API, I suggest the official documentation.
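A rough sketch of the scroll loop in curl form, on the 1.x API the question is running (the scroll ID below is a placeholder; in practice you would feed each returned batch of hits into the bulk API of the new index and repeat until no hits come back):

```shell
# Open a scroll over the old index; keep the context alive 1 minute per batch
curl -XGET 'localhost:9200/myindex/_search?scroll=1m' -d '{
  "size": 500,
  "query": { "match_all": {} }
}'

# Fetch the next batch, passing the _scroll_id from the previous response
curl -XGET 'localhost:9200/_search/scroll?scroll=1m' -d '<scroll_id from previous response>'
```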

And you might also want to take a look at this answer here, maybe it can also give you some ideas in case you are using Java.

3 Comments

You can use github.com/taskrabbit/elasticsearch-dump to copy the data to a new index with the correct number of shards and then remove the old one. That tool makes it easier than using the scroll search API directly.
It's a very good project, but when you have tens of millions of documents in your index it's extremely slow. I've benchmarked it against an optimized scan and scroll with the official Python API, and the latter runs at least 10 times faster. I'll let you judge ;-)
You need to increase the batch size; the default of 100 makes it slower. Also, my first run took about 2 hours and the second run took 35 min, so index caches make a huge difference.
