How to get multiple fields returned in elasticsearch query?

Question

How can I get unique combinations of multiple fields returned by an Elasticsearch query?

All of my documents have duplicate name and job fields. I would like to use an ES query to get all the unique values, with the name and job in the same response so that they stay tied together.

[
{
    "name": "albert",
    "job": "teacher",
    "dob": "11/22/91"
},
{
    "name": "albert",
    "job": "teacher",
    "dob": "11/22/91"
},
{
    "name": "albert",
    "job": "teacher",
    "dob": "11/22/91"
},
{
    "name": "justin",
    "job": "engineer",
    "dob": "1/2/93"
},
{
    "name": "justin",
    "job": "engineer",
    "dob": "1/2/93"
},
{
    "name": "luffy",
    "job": "rubber man",
    "dob": "1/2/99"
}
]

Expected result (in any format). I tried using aggs, but I only get one field back:

[
    {
        "name": "albert",
        "job": "teacher"
    },
    {
        "name": "justin",
        "job": "engineer"
    },
    {
        "name": "luffy",
        "job": "rubber man"
    }
]

This is what I tried so far

GET name.test.index/_search
{
  "size": 0,
  "aggs": {
    "name": {
      "terms": { "field": "name.keyword" }
    }
  }
}

Running the above query returns the following, which is good because the names are unique:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 95,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "name" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "Justin",
          "doc_count" : 56
        },
        {
          "key" : "Luffy",
          "doc_count" : 31
        },
        {
          "key" : "Albert",
          "doc_count" : 8
        }
      ]
    }
  }
}

I tried using a nested aggregation, but that did not work. Is there an alternative solution for getting multiple unique values, or am I missing something?

Val · Accepted Answer · 2020-06-03 15:26:45Z

That's a good start! There are a few ways to achieve what you want. Each provides a different response format, so you can decide which one you prefer.

The first option is to leverage the top_hits sub-aggregation and return the two fields for each name bucket:

GET name.test.index/_search
{
  "size": 0,
  "aggs": {
    "name": {
      "terms": {
        "field": "name.keyword"
      },
      "aggs": {
        "top": {
          "top_hits": {
            "_source": [
              "name",
              "job"
            ],
            "size": 1
          }
        }
      }
    }
  }
}
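
In the response to this query, each name bucket carries a `top.hits.hits` array whose single entry holds a `_source` with both fields. As a rough illustration of flattening that into the list format from the question, here is a minimal Python sketch. The `unique_pairs` helper is a hypothetical name, and `sample_response` is a hand-written, trimmed stand-in for the response shape, not real output from the index.

```python
def unique_pairs(response):
    """Collect one {name, job} dict per terms bucket of a top_hits response."""
    pairs = []
    for bucket in response["aggregations"]["name"]["buckets"]:
        hits = bucket["top"]["hits"]["hits"]
        if hits:  # the query asks top_hits for a single document per bucket
            source = hits[0]["_source"]
            pairs.append({"name": source["name"], "job": source["job"]})
    return pairs


# Hand-written sample in the shape of the response (illustrative only).
sample_response = {
    "aggregations": {
        "name": {
            "buckets": [
                {
                    "key": "albert",
                    "doc_count": 3,
                    "top": {"hits": {"hits": [
                        {"_source": {"name": "albert", "job": "teacher"}}
                    ]}},
                },
                {
                    "key": "justin",
                    "doc_count": 2,
                    "top": {"hits": {"hits": [
                        {"_source": {"name": "justin", "job": "engineer"}}
                    ]}},
                },
            ]
        }
    }
}

print(unique_pairs(sample_response))
```

Because `top_hits` is limited to one document per bucket, this yields one job per name. If the same name can appear with different jobs, only one of them would show up with this approach.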

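The second option is to use a script in your terms aggregation instead of a field to return a compound value. The script reads the keyword sub-fields, as in your original query, because doc values are not available on analyzed text fields by default:

GET name.test.index/_search
{
  "size": 0,
  "aggs": {
    "name": {
      "terms": {
        "script": "doc['name.keyword'].value + ' - ' + doc['job.keyword'].value"
      }
    }
  }
}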
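
With the scripted version, each bucket key is a single string such as `albert - teacher`, so the client has to split it again to recover the two fields. A minimal Python sketch of that step follows. The `split_compound_keys` helper and the `SEPARATOR` constant are hypothetical names, and `sample_response` is hand-written from the sample documents in the question rather than copied from a real response.

```python
SEPARATOR = " - "  # assumed to match the separator used in the script


def split_compound_keys(response):
    """Turn 'name - job' bucket keys back into separate fields."""
    pairs = []
    for bucket in response["aggregations"]["name"]["buckets"]:
        # Split on the first separator only, so the job may contain ' - '.
        name, job = bucket["key"].split(SEPARATOR, 1)
        pairs.append({"name": name, "job": job, "count": bucket["doc_count"]})
    return pairs


# Hand-written sample in the shape of a terms aggregation response.
sample_response = {
    "aggregations": {
        "name": {
            "buckets": [
                {"key": "albert - teacher", "doc_count": 3},
                {"key": "justin - engineer", "doc_count": 2},
                {"key": "luffy - rubber man", "doc_count": 1},
            ]
        }
    }
}

print(split_compound_keys(sample_response))
```

This only works if the separator cannot appear inside a name. If it can, a separator that never occurs in the data would be a safer choice in the script.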

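The third option is to use two levels of field collapsing. Collapsing needs a keyword or numeric field, so the keyword sub-fields are used here as well:

GET name.test.index/_search
{
  "collapse": {
    "field": "name.keyword",
    "inner_hits": {
      "name": "by_job",
      "collapse": {
        "field": "job.keyword"
      },
      "size": 1
    }
  }
}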
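
With collapsing, the results come back as regular hits rather than aggregation buckets: one hit per name, each with an `inner_hits.by_job` section. A minimal Python sketch of reading the pairs out of that shape is shown below. The `pairs_from_collapse` helper is a hypothetical name, and `sample_response` is a hand-written, trimmed stand-in for the response, not real output.

```python
def pairs_from_collapse(response):
    """Read {name, job} pairs from the 'by_job' inner hits of collapsed hits."""
    pairs = []
    for hit in response["hits"]["hits"]:
        inner_hits = hit["inner_hits"]["by_job"]["hits"]["hits"]
        for inner in inner_hits:
            source = inner["_source"]
            pairs.append({"name": source["name"], "job": source["job"]})
    return pairs


# Hand-written sample in the shape of a collapsed search response.
sample_response = {
    "hits": {
        "hits": [
            {
                "_source": {"name": "albert", "job": "teacher", "dob": "11/22/91"},
                "inner_hits": {"by_job": {"hits": {"hits": [
                    {"_source": {"name": "albert", "job": "teacher", "dob": "11/22/91"}}
                ]}}},
            },
            {
                "_source": {"name": "luffy", "job": "rubber man", "dob": "1/2/99"},
                "inner_hits": {"by_job": {"hits": {"hits": [
                    {"_source": {"name": "luffy", "job": "rubber man", "dob": "1/2/99"}}
                ]}}},
            },
        ]
    }
}

print(pairs_from_collapse(sample_response))
```

Since this is a normal search, only the first page of collapsed hits is returned (10 by default), so an index with more distinct names would need a larger `size` or pagination.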
