I'm evaluating Elasticsearch for the first time and have spent about a day with it. We already use Lucene extensively and want to move to ES. I'm exploring alternative data structures to what we currently have.

If I run a *match_all* query, this is what I get at the moment. I am happy with this structure.

{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 22,
      "max_score": 1,
      "hits": [
         {
            "_index": "integration-test-static",
            "_type": "sport",
            "_id": "4d38e07b-f3d3-4af2-9221-60450b18264a",
            "_score": 1,
            "_source": {
               "Descriptions": [
                  {
                     "FeedSource": "dde58b3b-145b-4864-9f7c-43c64c2fe815",
                     "Value": "Football"
                  },
                  {
                     "FeedSource": "e4b9ad44-00d7-4216-adf5-3a37eafc4c93",
                     "Value": "Football"
                  }
               ],
               "Synonyms": [
                  "Football"
               ]
            }
         }
      ]
   }
}

What I can't figure out is how to write a query that pulls back this document by searching for the synonym "Football". It looks like it should be easy!

I arrived at this approach after reading http://gibrown.wordpress.com/2013/01/24/elasticsearch-five-things-i-was-doing-wrong/, which mentions storing multiple fields in arrays. I realise my example does not have multiple fields, but we will certainly need a solution that caters for them.

I have tried various queries with filters, bool queries, and term and terms queries; none of them return the document.

2 Answers

What does your search and mappings look like?

If you let Elasticsearch generate the mapping, it will use the standard analyzer, which lowercases the text (and, in older versions, removes English stopwords by default).

So Football will actually be indexed as football. The term family of queries/filters does not analyze its input, so term:Football will look for Football, which is not in the index. The match family of queries does analyze its input, so it ends up looking for football.

This is a very common problem, and it is covered quite extensively in my article on Troubleshooting Elasticsearch searches, for Beginners, which can be worth skimming through. Text analysis is a very important part of working with search, so there are some more articles about it as well.
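To make the term versus match difference concrete, here is a minimal sketch in plain Python. It is a toy model, not Elasticsearch code: `standard_like_analyze`, `term_search` and `match_search` are hypothetical helpers, and the analyzer is only a rough stand-in (split on non-word characters, then lowercase) for what the standard analyzer does.

```python
import re


def standard_like_analyze(text):
    """Rough stand-in for the standard analyzer: split into words, lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def build_index(docs):
    """Map each analyzed token to the set of document ids that contain it."""
    index = {}
    for doc_id, values in docs.items():
        for value in values:
            for token in standard_like_analyze(value):
                index.setdefault(token, set()).add(doc_id)
    return index


def term_search(index, term):
    """Like a term query: look up the input exactly as given, with no analysis."""
    return sorted(index.get(term, set()))


def match_search(index, text):
    """Like a match query: analyze the input first, then look up each token."""
    hits = set()
    for token in standard_like_analyze(text):
        hits |= index.get(token, set())
    return sorted(hits)


# One document whose Synonyms array holds "Football" (the id is made up).
docs = {"sport-1": ["Football"]}
index = build_index(docs)

print(term_search(index, "Football"))   # [] because only "football" is indexed
print(match_search(index, "Football"))  # ['sport-1']
print(term_search(index, "football"))   # ['sport-1']
```

The point of the sketch is only that the indexed token and the looked-up token have to be identical; whether the real analyzer in your cluster behaves this way depends on your mapping.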

1 Comment

Thanks for a prompt reply (and Akshay). I will certainly look through the beginners guide. A few days and it'll all start sinking in...
2

A simple match query would work in this scenario.

POST integration-test-static/_search
{
    "query": {
        "match": {
           "Synonyms": "Football"
        }
    }
}

Which returns:

{
   "took": 0,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 0.30685282,
      "hits": [
         {
            "_index": "integration-test-static",
            "_type": "sport",
            "_id": "4d38e07b-f3d3-4af2-9221-60450b18264a",
            "_score": 0.30685282,
            "_source": {
               "Descriptions": [
                  {
                     "FeedSource": "dde58b3b-145b-4864-9f7c-43c64c2fe815",
                     "Value": "Football"
                  },
                  {
                     "FeedSource": "e4b9ad44-00d7-4216-adf5-3a37eafc4c93",
                     "Value": "Football"
                  }
               ],
               "Synonyms": [
                  "Football"
               ]
            }
         }
      ]
   }
}

3 Comments

Thanks, I can also confirm that this works. Could you possibly show how I might find "Football" only, to avoid returning "American Football", for example?
As Alex says, there's a lot you can do with analyzers to help you with your indexing and queries. A simple one to look at is the "keyword" analyzer. It treats the entire string as a single token and does not split it up. Note that it does not lowercase on its own; for case-insensitive matching, combine the keyword tokenizer with a lowercase filter in a custom analyzer. With that in place, a search for football will not match "american football".
The keyword analyzer was exactly what I was looking for here.
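The whole-string matching discussed in the comments above can be sketched as follows. This is again a toy Python model rather than Elasticsearch itself. The settings dictionary shows one way a custom analyzer with the `keyword` tokenizer and `lowercase` filter might be declared; the analyzer name `lowercase_keyword` is an assumption, and how it is attached to the `Synonyms` field depends on your Elasticsearch version, so that part is left out.

```python
import re

# Hypothetical index settings: "lowercase_keyword" is a made-up analyzer name.
ANALYSIS_SETTINGS = {
    "analysis": {
        "analyzer": {
            "lowercase_keyword": {
                "type": "custom",
                "tokenizer": "keyword",   # whole value becomes one token
                "filter": ["lowercase"],  # then lowercase that token
            }
        }
    }
}


def standard_like_analyze(text):
    """Rough stand-in for the standard analyzer: split into words, lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def keyword_lowercase_analyze(text):
    """Toy equivalent of the settings above: one lowercased token per value."""
    return [text.lower()]


def build_index(docs, analyze):
    """Map each token produced by `analyze` to the document ids containing it."""
    index = {}
    for doc_id, values in docs.items():
        for value in values:
            for token in analyze(value):
                index.setdefault(token, set()).add(doc_id)
    return index


def search(index, text, analyze):
    """Analyze the query text the same way, then collect matching ids."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return sorted(hits)


# Two made-up documents, keyed by illustrative ids.
docs = {"football": ["Football"], "american": ["American Football"]}

tokenized = build_index(docs, standard_like_analyze)
whole = build_index(docs, keyword_lowercase_analyze)

print(search(tokenized, "Football", standard_like_analyze))  # ['american', 'football']
print(search(whole, "Football", keyword_lowercase_analyze))  # ['football']
```

With word-level tokens, "Football" matches both documents; with one token per value, only the exact synonym matches, which is the behaviour asked for in the comment.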
