I'm looking into switching from Solr to Elasticsearch. I indexed a bunch of documents without providing a schema/mapping, and a lot of the fields that I would previously have set as indexed strings in Solr have been mapped as both text and keyword fields using multi-fields.

Is there any benefit to having a keyword field also indexed as a text field using multi-fields? In my case most field values are single words, so I'd imagine it wouldn't matter if they are sent to the analyzer, but the Elasticsearch docs seem to imply that keyword fields are not considered when searching, or are at least treated differently.

To expand on that a little: if I search for the term "ipad", would a document score higher if it had "ipad" in a keyword field as well as in some other text field, compared with the same document without the keyword field? And if "ipad" were only in a keyword field, would the document still match?

1 Answer

To answer my own question, I created a quick test. In this test, keyword and text fields behaved pretty much the same when searching, and the multi-field got the same score as a plain field of its primary type, so I guess the keyword sub-field has no effect on search scoring.

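Weirdly, a multi-word value got the same score in the keyword field as in the text field; I would have expected the keyword field to score lower or not match at all. A likely explanation, assuming a version where the `_all` field is enabled by default, is that `q=ipad` names no field, so the search runs against `_all`, which is analyzed like a text field regardless of each field's own type. For my purposes the result is fine, so I have not investigated it further.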
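The difference between the two field types is easier to see when each field is considered on its own. The following is a rough Python simulation, not Elasticsearch code: `analyze_text` is only an approximation of the standard analyzer (lowercase, split on non-word characters), `analyze_keyword` assumes a keyword field with no normalizer, and all function names are made up for illustration.

```python
import re


def analyze_text(value):
    """Rough stand-in for the standard analyzer: lowercase, then split into words."""
    return re.findall(r"\w+", value.lower())


def analyze_keyword(value):
    """A keyword field (without a normalizer) indexes the whole value as one term."""
    return [value]


def field_matches(term, value, field_type):
    """Return True if the single query term equals one of the indexed terms."""
    analyze = analyze_text if field_type == "text" else analyze_keyword
    return term in analyze(value)


if __name__ == "__main__":
    for value in ("ipad", "a green ipad"):
        for field_type in ("keyword", "text"):
            matched = field_matches("ipad", value, field_type)
            print(f"{field_type:8} {value!r:16} matches 'ipad': {matched}")
```

In this simulation only the multi-word value in the keyword field fails to match, because its single indexed term is the whole string "a green ipad". That is the behaviour I would expect from a search that targets the keyword field directly.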

Index Creation

PUT test_index
{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "test_type" : {
            "properties" : {
                "multifield": {
                  "type": "text",
                  "fields": {
                     "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                     }
                  }
                },

                "keywordfield": {
                  "type": "keyword"
                },

                "textfield": {
                  "type": "text"
                }

            }
        }
    }
}

Data Insert

POST /_bulk
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 1 } }
{ "doc" : { "multifield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 2 } }
{ "doc" : { "keywordfield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 3 } }
{ "doc" : { "keywordfield" : "a green ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 4 } }
{ "doc" : { "textfield" : "a yellow ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 5 } }
{ "doc" : { "keywordfield" : "ipad", "textfield" : "ipad"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 6 } }
{ "doc" : { "keywordfield" : "unrelated", "textfield" : "hopefully this wont show up"  }, "doc_as_upsert" : true }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": 7 } }
{ "doc" : { "textfield" : "ipad"  }, "doc_as_upsert" : true }

Results

GET /test_index/_search?q=ipad
{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 6,
      "max_score": 0.28122374,
      "hits": [
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "5",
            "_score": 0.28122374,
            "_source": {
               "keywordfield": "ipad",
               "textfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "1",
            "_score": 0.2734406,
            "_source": {
               "multifield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "2",
            "_score": 0.2734406,
            "_source": {
               "keywordfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "7",
            "_score": 0.2734406,
            "_source": {
               "textfield": "ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "3",
            "_score": 0.16417998,
            "_source": {
               "keywordfield": "a green ipad"
            }
         },
         {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "4",
            "_score": 0.16417998,
            "_source": {
               "textfield": "a yellow ipad"
            }
         }
      ]
   }
}
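
The search above does not name a field. To test each field in isolation, the query can target one field explicitly. Here is a minimal Python sketch that only builds the request bodies for the `term` and `match` queries of the query DSL; the helper names are hypothetical, and sending the bodies to `/test_index/_search` is left out. I have not run these against the test index, so no results are shown.

```python
import json


def term_query(field, value):
    """Body for a term query: the value is compared as-is with the indexed terms."""
    return {"query": {"term": {field: value}}}


def match_query(field, text):
    """Body for a match query: the text is analyzed before it is compared."""
    return {"query": {"match": {field: text}}}


if __name__ == "__main__":
    bodies = [
        term_query("keywordfield", "ipad"),        # plain keyword field
        term_query("multifield.keyword", "ipad"),  # keyword sub-field of the multi-field
        match_query("textfield", "ipad"),          # analyzed text field
    ]
    for body in bodies:
        print("GET /test_index/_search")
        print(json.dumps(body, indent=2))
```

The sub-field of a multi-field is addressed by its dotted path, here `multifield.keyword`, which follows from the mapping in the index creation step.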