I have an index of documents with the following (simplified) structure.

{
  "product_id": "abc123",
  "properties": [
    {
      "key": "width",
      "value": 1000
    },
    {
      "key": "height",
      "value": 2000
    },
    {
      "key": "depth",
      "value": 500
    }
  ]
}

Each document can have hundreds of properties.

Now I want to search for documents matching a query and also specify which properties each returned document should be populated with. Basically, I want to write the following request:

Get me all documents that match query x, and populate each document with the properties ["height", "width", "foobar" ].

The array with the properties I want to return is created at query time based on input from the user. The document in the response to the query would look like this:

{
  "product_id": "abc123",
  "properties": [
    {
      "key": "width",
      "value": 1000
    },
    {
      "key": "height",
      "value": 2000
    }
    // No depth!
  ]
}

I have tried to achieve this through source filtering to no avail. I suspect script fields might be the only way to solve this, but I would rather use some standard way. Anyone got any ideas?

1 Answer

The best approach I can think of is to map the properties as a nested field and use inner_hits. For example:

PUT proptest
{
  "mappings": {
    "default": {
      "properties": {
        "product_id": {
          "type": "keyword"
        },
        "color": {
          "type": "keyword"
        },
        "props": {
          "type": "nested"
        }
      }
    }
  }
}

PUT proptest/default/1
{
  "product_id": "abc123",
  "color": "red",
  "props": [
    {
      "key": "width",
      "value": 1000
    },
    {
      "key": "height",
      "value": 2000
    },
    {
      "key": "depth",
      "value": 500
    }
  ]
}

PUT proptest/default/2
{
  "product_id": "def",
  "color": "red",
  "props": [
  ]
}

PUT proptest/default/3
{
  "product_id": "ghi",
  "color": "blue",
  "props": [
    {
      "key": "width",
      "value": 1000
    },
    {
      "key": "height",
      "value": 2000
    },
    {
      "key": "depth",
      "value": 500
    }
  ]
}

Now we can query by color and fetch only the height, depth and foobar properties:

GET proptest/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "color": {
              "value": "red"
            }
          }
        },
        {
          "bool": {
            "should": [
              {
                "nested": {
                  "path": "props",
                  "query": {
                    "match": {
                      "props.key": "height depth foobar"
                    }
                  },
                  "inner_hits": {}
                }
              },
              {
                "match_all": {}
              }
            ]
          }
        }
      ]
    }
  },
  "_source": {
    "excludes": "props"
  }
}

The output is

{
  "hits": {
    "total": 2,
    "max_score": 2.2685113,
    "hits": [
      {
        "_index": "proptest",
        "_type": "default",
        "_id": "1",
        "_score": 2.2685113,
        "_source": {
          "color": "red",
          "product_id": "abc123"
        },
        "inner_hits": {
          "props": {
            "hits": {
              "total": 2,
              "max_score": 0.9808292,
              "hits": [
                {
                  "_index": "proptest",
                  "_type": "default",
                  "_id": "1",
                  "_nested": {
                    "field": "props",
                    "offset": 2
                  },
                  "_score": 0.9808292,
                  "_source": {
                    "key": "depth",
                    "value": 500
                  }
                },
                {
                  "_index": "proptest",
                  "_type": "default",
                  "_id": "1",
                  "_nested": {
                    "field": "props",
                    "offset": 1
                  },
                  "_score": 0.9808292,
                  "_source": {
                    "key": "height",
                    "value": 2000
                  }
                }
              ]
            }
          }
        }
      },
      {
        "_index": "proptest",
        "_type": "default",
        "_id": "2",
        "_score": 1.287682,
        "_source": {
          "color": "red",
          "product_id": "def"
        },
        "inner_hits": {
          "props": {
            "hits": {
              "total": 0,
              "max_score": null,
              "hits": []
            }
          }
        }
      }
    ]
  }
}

Note that the results contain both products abc123 and def, each with the correctly filtered properties. Product abc123 partially matches the given property list (it has height and depth but not foobar), while def contains none of them. Which documents are returned is determined only by the outer query color:red; the nested clause only controls which properties appear in inner_hits.

The drawback of this method is that the properties are not returned under the top-level _source of each hit but under its inner_hits key.
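If the client needs the document shape from the question, the inner hits can be folded back into each document after the search. The following is a minimal sketch, not part of the original answer: the function name merge_inner_hits and the target key "properties" are assumptions made for illustration, and the sample response is a trimmed copy of the output shown above.

```python
def merge_inner_hits(response, path="props", target="properties"):
    """Rebuild flat documents from a search response that used inner_hits.

    `path` is the nested field name used in the query; `target` is the
    (hypothetical) key under which the filtered properties are placed.
    """
    docs = []
    for hit in response["hits"]["hits"]:
        doc = dict(hit.get("_source", {}))
        inner = (
            hit.get("inner_hits", {})
            .get(path, {})
            .get("hits", {})
            .get("hits", [])
        )
        # Inner hits arrive ordered by score; restore the original array
        # order using the nested offset reported for each inner hit.
        inner = sorted(inner, key=lambda h: h["_nested"]["offset"])
        doc[target] = [h["_source"] for h in inner]
        docs.append(doc)
    return docs


# Trimmed copy of the response shown in the answer.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {"color": "red", "product_id": "abc123"},
                "inner_hits": {"props": {"hits": {"hits": [
                    {"_nested": {"field": "props", "offset": 2},
                     "_source": {"key": "depth", "value": 500}},
                    {"_nested": {"field": "props", "offset": 1},
                     "_source": {"key": "height", "value": 2000}},
                ]}}},
            },
            {
                "_id": "2",
                "_source": {"color": "red", "product_id": "def"},
                "inner_hits": {"props": {"hits": {"hits": []}}},
            },
        ]
    }
}

merged_docs = merge_inner_hits(sample_response)
```

With the sample above, the first merged document carries only the height and depth properties and the second carries an empty list, which matches the shape asked for in the question.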


5 Comments

Thank you! I've been examining this approach as well, but unfortunately it doesn't satisfy one of the requirements: the "foobar" in my example. I want to say "only include properties with keys x, y or z", but I don't want to say "every hit has to contain properties with x, y and z". Documents with just x and y should also be considered hits.
@Bulgur What is "foobar" exactly? Is it a property that is missing from the document?
Yes, it is missing from that particular document but might exist in other documents. Basically, I don't want the main documents filtered based on which properties they have. I only want to filter which properties are returned for each matching document, to reduce the size of the response. So the input is an array of property keys, and all properties with matching keys are returned in each document. But they do not all have to exist in each document.
@Bulgur Improved the answer. The original version was working for partial matches but not if none of the properties matched.
Exactly what I needed. Thank you very much! I ended up using a terms query in the inner_hits bit instead of a match though. The match didn't work, but the terms did. "query": { "terms" : { "props.key" : ["width", "foobar"]} }
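Since the key list is built at query time, the terms variant from the last comment can be generated from user input. This is a rough sketch under a few assumptions: the helper name build_search_body is made up for illustration, props.key is assumed to be indexed so that the exact key strings match (for example as a keyword field), and the explicit inner_hits size is an addition that is not in the original answer. By default inner_hits returns only the top 3 nested hits, so a document with hundreds of properties needs a larger size; as far as I know the default upper limit is 100, controlled by the index.max_inner_result_window setting.

```python
def build_search_body(outer_query, keys, path="props", size=100):
    """Build a search body that filters returned nested properties by key.

    `outer_query` alone decides which documents match; the nested clause
    sits next to a match_all inside `should`, so it never excludes a
    document and only produces the inner_hits section.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    outer_query,
                    {
                        "bool": {
                            "should": [
                                {
                                    "nested": {
                                        "path": path,
                                        # terms matches any of the given keys exactly
                                        "query": {
                                            "terms": {f"{path}.key": list(keys)}
                                        },
                                        # assumption: raise the default of 3 inner hits
                                        "inner_hits": {"size": size},
                                    }
                                },
                                {"match_all": {}},
                            ]
                        }
                    },
                ]
            }
        },
        # The full nested array is dropped from _source; only inner_hits carry it.
        "_source": {"excludes": [path]},
    }


search_body = build_search_body(
    {"term": {"color": {"value": "red"}}},
    ["width", "foobar"],
)
```

The resulting dictionary can be serialized to JSON and sent as the body of GET proptest/_search.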
