I have a query like this:

xml_db.find(
    {
        'high_performer': {
            '$nin': [some_value]
        },
        'low_performer': {
            '$nin': [some_value]
        },
        'expiration_date': {
            '$gte': datetime.now().strftime('%Y-%m-%d')
        },
        'source': 'some_value'        
    }
)

I tried to create an index on those fields, but I get this error:

pymongo.errors.OperationFailure: cannot index parallel arrays [low_performer] [high_performer]

So how can I run this query efficiently?

1 Answer

The field order in a compound index should follow the equality --> sort --> range (ESR) rule. A good description of this can be found in this response.

This means that the first field in the index would be source, followed by the range filters (expiration_date, low_performer and high_performer).

As you noticed, only one of the "performer" fields can be included in the index: a compound index can cover at most one array-valued field per document, which is what the "cannot index parallel arrays" error is reporting. You should use your knowledge of the data set to determine which filter (low_performer or high_performer) is more selective, and include that one in the index.
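If it is not obvious which of the two filters is more selective, one rough way to check is to count how many documents pass each filter on its own. This is only a sketch: it assumes `collection` is a pymongo collection (anything with a `count_documents` method works), and the helper name `pick_more_selective` and its arguments are made up for illustration.

```python
def pick_more_selective(collection, source, excluded_value):
    """Return the "performer" field whose $nin filter matches fewer documents.

    `collection` is assumed to be a pymongo Collection; this helper is
    hypothetical and not part of pymongo.
    """
    counts = {}
    for field in ("high_performer", "low_performer"):
        # Count documents for this source that pass the $nin filter on one field only.
        counts[field] = collection.count_documents(
            {"source": source, field: {"$nin": [excluded_value]}}
        )
    # Fewer matching documents means the filter discards more, i.e. it is more selective.
    return min(counts, key=counts.get), counts
```

The counts depend on the value being excluded, so it is probably worth trying a few representative values before settling on a field.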

Assuming that high_performer is more selective, the only remaining step would be to determine the ordering between expiration_date and high_performer. Again, you should use your knowledge of the data set to make this determination based on selectivity.

Assuming expiration_date is more selective, the index to create would then be:

{ "source" : 1, "expiration_date" : 1, "high_performer" : 1 }
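In pymongo, creating that index could look roughly like the sketch below. The helper names `build_index_keys` and `ensure_index` are made up for illustration, and `collection` is assumed to be the same pymongo collection as `xml_db` in the question.

```python
def build_index_keys(array_field="high_performer"):
    """Key list in equality -> range order, containing exactly one array field."""
    return [
        ("source", 1),           # equality match goes first
        ("expiration_date", 1),  # range filter, assumed here to be the more selective one
        (array_field, 1),        # the single array field the index can hold
    ]


def ensure_index(collection, array_field="high_performer"):
    # create_index accepts a list of (field, direction) pairs; 1 means ascending.
    return collection.create_index(build_index_keys(array_field))
```

After calling `ensure_index(xml_db)`, running the original `find(...)` with `.explain()` should show whether the query planner actually picks the new index.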

1 Comment

Thanks Adam, this clears up a lot of my confusion about effective indexing.
