I am working with AWS Elasticsearch from Python. I have a JSON file with 3 fields

("cat1", "Cat2", "cat3"); the records are separated by \n.
Example: cat1: food, cat2: wine, cat3: lunch, etc.

from requests_aws4auth import AWS4Auth
import boto3
import requests

# url and awsauth are assumed to be defined earlier (not shown here)
payload = {
  "settings": {
    "number_of_shards": 10,
    "number_of_replicas": 5
  },
  "mappings": {
    "Categoryall": {
      "properties": {
        "cat1": {
          "type": "string"
        },
        "Cat2": {
          "type": "string"
        },
        "cat3": {
          "type": "string"
        }
      }
    }
  }
}

r = requests.put(url, auth=awsauth, json=payload)

I created the schema/mapping for the index as shown above, but I don't know how to populate the index. I am thinking of looping over the JSON file and sending a POST request for each record, but I have no idea how to proceed.

I want to create the index and bulk upload this file into it. Any suggestions would be appreciated.

1 Answer

Take a look at the Elasticsearch Bulk API.

Basically, you need to create a bulk request body and POST it to your "https://{elastic-endpoint}/_bulk" URL.

The following example shows a bulk request that inserts 3 JSON records into an index called "my_index":

{ "index" : { "_index" : "my_index", "_type" : "_doc", "_id" : "1" } }
{ "cat1" : "food 1", "cat2": "wine 1", "cat3": "lunch 1" }
{ "index" : { "_index" : "my_index", "_type" : "_doc", "_id" : "2" } }
{ "cat1" : "food 2", "cat2": "wine 2", "cat3": "lunch 2" }
{ "index" : { "_index" : "my_index", "_type" : "_doc", "_id" : "3" } }
{ "cat1" : "food 3", "cat2": "wine 3", "cat3": "lunch 3" }

where each JSON record is represented by 2 JSON objects: an action line followed by the document itself.
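
As a rough sketch of how such a body could be generated from a file with one JSON record per line, something like the following might work. The function name build_bulk_body, the sequential "_id" values, and the default index and type names are assumptions made for illustration, not part of the Bulk API itself.

```python
import json


def build_bulk_body(lines, index_name="my_index", doc_type="_doc"):
    """Build a bulk request body from an iterable of JSON strings.

    Each non-empty input line becomes two output lines: an "index"
    action line and the document itself. IDs are assigned
    sequentially starting at 1 (an assumption for this sketch).
    """
    parts = []
    doc_id = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        doc_id += 1
        record = json.loads(line)  # also validates that the line is JSON
        action = {
            "index": {
                "_index": index_name,
                "_type": doc_type,
                "_id": str(doc_id),
            }
        }
        parts.append(json.dumps(action))
        parts.append(json.dumps(record))
    # Every line of a bulk body, including the last, must end with a newline.
    return "".join(part + "\n" for part in parts)
```

Since an open text file is an iterable of lines, it can be passed directly as lines. The returned string can then be sent as data= in a requests.post call to the _bulk URL, together with a Content-Type: application/x-ndjson header.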

So if you write your bulk request body into a file called post-data.txt, you can post it from Python with something like this:

with open('post-data.txt', 'rb') as payload:
    # The bulk body is newline-delimited JSON; add further parameters as needed.
    r = requests.post('https://your-elastic-endpoint/_bulk', auth=awsauth,
                      data=payload,
                      headers={'Content-Type': 'application/x-ndjson'})

Alternatively, you can try the bulk helpers in the Python elasticsearch client library.
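
A minimal sketch of the helpers approach, assuming the elasticsearch package is installed and that es is an already configured client (how to wire it up with AWS authentication is not shown here). The names generate_actions and bulk_upload are hypothetical; helpers.bulk is the library's own function and accepts an iterable of action dictionaries.

```python
import json


def generate_actions(lines, index_name="my_index", doc_type=None):
    """Yield one action dict per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        action = {"_index": index_name, "_source": json.loads(line)}
        if doc_type is not None:
            # Older clusters expect a type; pass doc_type only if yours does.
            action["_type"] = doc_type
        yield action


def bulk_upload(es, path, index_name="my_index", doc_type=None):
    """Stream the file at `path` into the index through helpers.bulk."""
    # Imported here so generate_actions can be used without the library.
    from elasticsearch import helpers

    with open(path) as f:
        return helpers.bulk(es, generate_actions(f, index_name, doc_type))
```

Because the actions come from a generator, the file does not need to be loaded into memory in one piece, and no explicit document IDs are set, so Elasticsearch assigns them.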

2 Comments

Thanks for the reply, but how do you do it using Python? I tried:
request.post(url,auth=awsauth,json = 'my-index/_bulk?pretty --data-binary @index_json.json')
