I have documents in Elasticsearch, and I want to select only those whose array field is either empty or contains nothing but empty strings.

#doc 1
{
  "_index": "my-index-000001",
  "_type": "_doc",
  "_id": "0",
  "_source": {
    "doc":{
        "field": ["",""]
    }
  }
}

#doc 2
{
  "_index": "my-index-000001",
  "_type": "_doc",
  "_id": "0",
  "_source": {
    "doc":{
        "field": []
    }
  }
}

#doc 3
{
  "_index": "my-index-000001",
  "_type": "_doc",
  "_id": "0",
  "_source": {
    "doc":{
        "field": ["hello",""]
    }
  }
}

From the documents above, is it possible to return only doc 1 and doc 2? In both of them, "field" is either an empty array or contains only empty strings.

1 Answer

The query below returns only the documents whose array is empty or contains nothing but empty strings.

The first should clause matches documents whose array contains an empty string. The second should clause matches documents where the field has no indexed values, which is the case for an empty array. The must_not clause with the wildcard ?* then excludes every document that has at least one non-empty element in the array.

{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "city.keyword": {
              "value": ""
            }
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "exists": {
                  "field": "city.keyword"
                }
              }
            ]
          }
        }
      ],
      "must_not": [
        {
          "wildcard": {
            "city.keyword": "?*"
          }
        }
      ]
    }
  }
}
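
If you need the same query for the field from the question, a small helper can build it. This is only a sketch: the helper name `empty_or_blank_array_query` is made up for illustration, and `doc.field.keyword` assumes the default dynamic mapping, where a string field gets a `.keyword` sub-field. Adjust the path if your mapping differs.

```python
import json


def empty_or_blank_array_query(field):
    """Build the bool query shown above for an arbitrary keyword field.

    `field` is assumed to be the full path of a keyword (sub-)field.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Array contains at least one empty string.
                    {"term": {field: {"value": ""}}},
                    # Field has no indexed values (e.g. an empty array).
                    {"bool": {"must_not": [{"exists": {"field": field}}]}},
                ],
                # Drop documents with any element of one or more characters.
                "must_not": [{"wildcard": {field: "?*"}}],
            }
        }
    }


# Hypothetical field path for the documents in the question.
query = empty_or_blank_array_query("doc.field.keyword")
print(json.dumps(query, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against your index.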

Below are the sample documents in my index:

{
"hits" : [
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "4g3P2H4BrzeQ9ErqJwUL",
        "_score" : 1.0,
        "_source" : {
          "city" : [
            "",
            ""
          ]
        }
      },
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "4w3P2H4BrzeQ9ErqXgWT",
        "_score" : 1.0,
        "_source" : {
          "city" : [ ]
        }
      },
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "5A3P2H4BrzeQ9ErqhwUI",
        "_score" : 1.0,
        "_source" : {
          "city" : [
            "hello",
            ""
          ]
        }
      },
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "5Q3q2H4BrzeQ9ErqOAXW",
        "_score" : 1.0,
        "_source" : {
          "city" : [
            "hello",
            "sagar"
          ]
        }
      }
    ]
}

Sample output after executing the query above:

{
"hits" : [
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "4g3P2H4BrzeQ9ErqJwUL",
        "_score" : 0.5619608,
        "_source" : {
          "city" : [
            "",
            ""
          ]
        }
      },
      {
        "_index" : "arrayindex",
        "_type" : "_doc",
        "_id" : "4w3P2H4BrzeQ9ErqXgWT",
        "_score" : 0.0,
        "_source" : {
          "city" : [ ]
        }
      }
    ]
}
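
To see why only the first two documents come back, the three clauses can be mimicked in plain Python on the sample `city` arrays. This is a rough local model of the query's logic, not how Elasticsearch evaluates it internally, and the function name `matches_answer_query` is just an illustration.

```python
def matches_answer_query(values):
    """Mimic the bool query on one document's array of strings."""
    has_empty_term = "" in values                     # should: term == ""
    has_no_terms = len(values) == 0                   # should: must_not exists
    has_non_empty = any(len(v) >= 1 for v in values)  # must_not: wildcard ?*
    # At least one should clause must match, and the must_not clause must not.
    return (has_empty_term or has_no_terms) and not has_non_empty


# The four sample documents from the index above, keyed by _id.
sample_docs = {
    "4g3P2H4BrzeQ9ErqJwUL": ["", ""],
    "4w3P2H4BrzeQ9ErqXgWT": [],
    "5A3P2H4BrzeQ9ErqhwUI": ["hello", ""],
    "5Q3q2H4BrzeQ9ErqOAXW": ["hello", "sagar"],
}

matching_ids = [doc_id for doc_id, city in sample_docs.items()
                if matches_answer_query(city)]
print(matching_ids)  # ['4g3P2H4BrzeQ9ErqJwUL', '4w3P2H4BrzeQ9ErqXgWT']
```

The document with ["hello", ""] passes the first should clause but is removed by the must_not, which is what keeps partially filled arrays out of the result.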