
I'm trying to store the output of CoreNLP in Elasticsearch, using one of the Python wrappers for CoreNLP. The wrapper returns a fairly large object that, somewhere inside, contains lists similar to this:

['word', {'tag': 'abc', 'index': 2, ...}]

That is, a list of both a string and a dictionary.

When I try to index this in Elasticsearch, I get a MapperParsingException. As explained in this question, this is probably because my list contains different types, and Elasticsearch likes arrays of one type.

Is there a way to convince Elasticsearch to map this kind of data? I can turn the list into a dictionary before storing it, but that would require me to convert it back into a list when reading (there's a lot of other code that uses this data), and I'd rather not do that.
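For reference, the conversion I'd rather avoid would look roughly like the sketch below. It assumes Python 3 and assumes every problematic list is exactly a `[string, dict]` pair; the key names `word` and `attrs` and the functions `encode` and `decode` are names I made up for illustration, not part of the wrapper or of Elasticsearch.

```python
# Hypothetical key names for the converted form; they must not clash
# with dicts in the real data that have exactly these two keys.
WORD_KEY = "word"
ATTRS_KEY = "attrs"


def is_pair(value):
    """True for a two-element list of the form [str, dict]."""
    return (
        isinstance(value, list)
        and len(value) == 2
        and isinstance(value[0], str)
        and isinstance(value[1], dict)
    )


def encode(value):
    """Recursively replace every [str, dict] pair with a dict."""
    if is_pair(value):
        return {WORD_KEY: value[0], ATTRS_KEY: encode(value[1])}
    if isinstance(value, list):
        return [encode(item) for item in value]
    if isinstance(value, dict):
        return {key: encode(item) for key, item in value.items()}
    return value


def decode(value):
    """Inverse of encode: turn the converted dicts back into pairs."""
    if isinstance(value, dict):
        if set(value) == {WORD_KEY, ATTRS_KEY}:
            return [value[WORD_KEY], decode(value[ATTRS_KEY])]
        return {key: decode(item) for key, item in value.items()}
    if isinstance(value, list):
        return [decode(item) for item in value]
    return value


# Made-up sample shaped like the lists described above.
parsed = {
    "sentences": [
        {"words": [["Hello", {"tag": "UH", "index": 1}],
                   ["world", {"tag": "NN", "index": 2}]]}
    ]
}
encoded = encode(parsed)      # safe to index: no mixed-type arrays
restored = decode(encoded)    # what the rest of the code expects
```

Every read path would have to go through something like `decode`, which is the extra step I'm hoping to avoid.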

  • What stops you from using the suggestion from that SO post? You can sidestep this issue by explicitly declaring data as an object and setting enabled: false, but this is probably not the solution you want (since that just tells ES to store data as a text field, with no parsing). Commented Jul 7, 2015 at 7:10
  • Because I do want the nested object to be parsed and indexed. Commented Jul 7, 2015 at 10:56
  • That's not possible. You can't have different types in the same field, basically. Commented Jul 7, 2015 at 11:02
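
For readers who do want the `enabled: false` workaround mentioned in the comments, a minimal sketch of the mapping fragment might look like this. The field name `parse` and the helper `disabled_object_mapping` are hypothetical, and where the `properties` block has to be nested in the index-creation request depends on the Elasticsearch version, so treat it as an illustration rather than a drop-in mapping.

```python
import json


def disabled_object_mapping(field_name):
    """Build a mapping fragment that marks one field as an unparsed object.

    With "enabled": False the field's content is kept in the stored
    document but is not parsed or indexed, so it cannot be searched.
    """
    return {
        "properties": {
            field_name: {"type": "object", "enabled": False}
        }
    }


# "parse" is a made-up field name for the CoreNLP output.
mapping = disabled_object_mapping("parse")
print(json.dumps(mapping, indent=2))
```

As the comments point out, this only avoids the exception; the nested data is not indexed, so it does not help if the content needs to be searchable.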
