
I am using elasticsearch-dsl-py, the Python DSL for Elasticsearch. My goal is to loop over the response data as easily as possible.

import datetime
import json
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

e_search = Elasticsearch([{'host': 'my-alias', 'port': 5648}])

s = Search(using=e_search, index='sampleindex-2019.10') \
    .filter('range' ,  **{'@timestamp': {'gte': 1571633450000, 'lt': 1571669450000, 'format' : 'epoch_millis'}})

When I execute this I get the following values:

response = s.execute()
print(response.success())   # True
print(response.took)        # 41
print(response.hits.total)  # 6582

However, when I loop over the results, only 10 hits are printed:

for hit in response:
    print(hit)
<Hit(sampleindex-2019.10/nQGt7G0BGh3E1MmaFw8e): {'startTime': '2019-10-21T13:57:05.621300916+09:00', 'header...'}>
<Hit(sampleindex-2019.10/egCp7G0BGh3E1Mmaq9bC): {'startTime': '2019-10-21T13:53:15.32923433+09:00', 'headers...'}>
<Hit(sampleindex-2019.10/hACo7G0BGh3E1MmaNsXk): {'headers': {'http_version': 'HTTP/1.1', 'http_user_agent': ...}>
<Hit(sampleindex-2019.10/VgCp7G0BGh3E1Mmae9Tv): {'headers': {'http_version': 'HTTP/1.1', 'http_user_agent': ...}>
<Hit(sampleindex-2019.10/nQGt7G0BGh3E1MmaFw8e): {'startTime': '2019-10-21T13:57:05.621300916+09:00', 'header...'}>
<Hit(sampleindex-2019.10/cwGv7G0BGh3E1Mma1Ddj): {'headers': {'http_version': 'HTTP/1.1', 'http_user_agent': ...}>
<Hit(sampleindex-2019.10/PgGv7G0BGh3E1MmaMzCA): {'startTime': '2019-10-21T13:59:11.83491578+09:00', 'headers...'}>
<Hit(sampleindex-2019.10/4wGw7G0BGh3E1MmaSjzb): {'headers': {'http_version': 'HTTP/1.1', 'http_user_agent': ...}>
<Hit(sampleindex-2019.10/cAGs7G0BGh3E1Mma_Q5Z): {'headers': {'http_version': 'HTTP/1.1', 'http_user_agent': ...}>
<Hit(sampleindex-2019.10/6AGw7G0BGh3E1Mma60OW): {'headers': {'http_version': 'HTTP/1.1', 'http_user_agent': ...}>

If I want to work with this output, for example by looping over all of the results and storing the info in a dictionary, how can I do that as easily as possible with elasticsearch-dsl-py?

1 Answer


I found this excerpt in the GitHub docs (also at Read The Docs):

To specify the from/size parameters, use the Python slicing API:

s = s[10:20]
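
For context: a search returns only the first 10 hits by default, which is why the loop in the question stops at 10. A slice `s[start:stop]` corresponds to `from=start` and `size=stop - start`. The sketch below only computes the slice bounds for walking a result set page by page; `page_bounds` is a hypothetical helper, not part of elasticsearch-dsl-py, and the totals are made-up numbers.

```python
def page_bounds(total_hits, page_size):
    """Yield (start, stop) pairs that could be used as s[start:stop]."""
    for start in range(0, total_hits, page_size):
        # The last page may be shorter than page_size.
        yield start, min(start + page_size, total_hits)


# Illustrative numbers only: 25 matching documents, 10 per page.
bounds = list(page_bounds(25, 10))
print(bounds)
```

Each pair would then be applied as `s[start:stop].execute()`. Keep in mind that from/size paging is capped by the index's `max_result_window` setting (10,000 by default), so for larger result sets the scan approach is the safer choice.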

If you want to access all the documents matched by your query you can use the scan method which uses the scan/scroll elasticsearch API:

for hit in s.scan():
    print(hit.title)

Note that in this case the results won't be sorted.
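
To store the results in a dictionary, as the question asks, one option is to key each document by its ID. This is a minimal sketch, not the only way to do it: `hits_to_dict` is a hypothetical helper, and `FakeHit` is a stand-in used only so the example runs without a cluster. It relies on each hit exposing `meta.id` and `to_dict()`, which the `Hit` objects yielded by `s.scan()` provide.

```python
from types import SimpleNamespace


def hits_to_dict(hits):
    """Map each hit's document ID to its fields as a plain dict."""
    results = {}
    for hit in hits:
        results[hit.meta.id] = hit.to_dict()
    return results


class FakeHit:
    """Stand-in for an elasticsearch-dsl Hit (assumption, for the demo only)."""

    def __init__(self, doc_id, source):
        self.meta = SimpleNamespace(id=doc_id)
        self._source = source

    def to_dict(self):
        return dict(self._source)


# Made-up documents; with a real cluster, pass s.scan() instead.
fake_hits = [
    FakeHit("doc-1", {"startTime": "2019-10-21T13:57:05"}),
    FakeHit("doc-2", {"headers": {"http_version": "HTTP/1.1"}}),
]
by_id = hits_to_dict(fake_hits)
print(by_id["doc-2"]["headers"]["http_version"])
```

With the search from the question, the call would be `by_id = hits_to_dict(s.scan())`, which holds every matching document in memory at once.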
