
I am indexing a large data set in which each entity has multiple names. I have defined the name field as an array and I store around four names in it. Some of the names contain spaces, and those names are being tokenized. Can I avoid that?

I know that Elasticsearch offers both the text and keyword types for strings, but how do I declare the type as keyword when the field holds an array? By default, my array fields are mapped as text. I want them to be treated as keyword so that they are not tokenized during indexing.

Expected: If I store "Hello World" in an array, I should be able to search "Hello World".

Current behaviour: "hello" and "world" are stored as separate terms because the value is tokenized.

1 Answer


There is no “array” data type in Elasticsearch: any field can hold zero or more values of its mapped type. Let me explain with an example.

Suppose you have defined this property:

{
  "tagIds": {
    "type": "integer"
  }
}

A document is then indexed with several values for it:

{
  "tagIds": [124, 452, 234]
}

tagIds is then automatically treated as an array of integers.
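As a rough mental model of the point above, the sketch below treats a single value and an array the same way: as a flat list of values for the field. The helper name `field_values` is hypothetical and is not part of any Elasticsearch client; it only illustrates the idea, including the fact that nested arrays are flattened.

```python
from typing import Any, List


def field_values(value: Any) -> List[Any]:
    """Toy model: normalize a field's value into a flat list of values.

    A scalar becomes a one-element list, an array stays an array,
    nested arrays are flattened, and a missing value (None) yields no values.
    """
    if value is None:
        return []
    if isinstance(value, list):
        flat: List[Any] = []
        for item in value:
            flat.extend(field_values(item))  # flatten nested arrays
        return flat
    return [value]


print(field_values(124))              # [124]
print(field_values([124, 452, 234]))  # [124, 452, 234]
print(field_values([1, [2, 3]]))      # [1, 2, 3]
```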

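Now, coming to your case: create a field such as “name” with the type keyword. Elasticsearch accepts either a single value or an array for the same field, but it is good practice to always pass an array, even when there is only one value, so that documents stay consistent. The mapping below includes the `_doc` type level used by Elasticsearch 6.x; in 7.0 and later, mapping types are no longer specified, so `properties` goes directly under `mappings`. Create the mapping first: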

PUT test
{
  "mappings": {
    "_doc": {
      "properties": {
        "name": {
          "type": "keyword"
        }
      }
    }
  }
}

Then index a document:

PUT test/_doc/1
{
  "name": ["yuvi"]
}
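To see why the keyword type gives the expected behaviour, here is a minimal Python sketch of a toy inverted index. It is only an illustration under simplifying assumptions: `text_analyzer` is a rough stand-in for text analysis (lowercase, split on whitespace) and not Elasticsearch's real analyzer, and `build_index` and `term_lookup` are hypothetical helpers. The last line shows the body of a term query, which is the exact-match lookup you would send to the `_search` endpoint for a keyword field.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


def text_analyzer(value: str) -> List[str]:
    # Rough stand-in for a text field: lowercase and split on whitespace.
    return value.lower().split()


def keyword_analyzer(value: str) -> List[str]:
    # A keyword field keeps the whole value as a single, unchanged term.
    return [value]


def build_index(docs: Dict[str, List[str]],
                analyze: Callable[[str], List[str]]) -> Dict[str, Set[str]]:
    """Map each term to the set of document ids that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, names in docs.items():
        for name in names:  # every value in the array is indexed
            for term in analyze(name):
                index[term].add(doc_id)
    return index


def term_lookup(index: Dict[str, Set[str]], term: str) -> Set[str]:
    """Exact term lookup, with no analysis applied to the search input."""
    return index.get(term, set())


docs = {"1": ["Hello World", "yuvi"]}
text_index = build_index(docs, text_analyzer)
keyword_index = build_index(docs, keyword_analyzer)

print(sorted(text_index))                         # ['hello', 'world', 'yuvi']
print(sorted(keyword_index))                      # ['Hello World', 'yuvi']
print(term_lookup(keyword_index, "Hello World"))  # {'1'}
print(term_lookup(text_index, "Hello World"))     # set()

# Request body for an exact match on the keyword field "name".
term_query = {"query": {"term": {"name": "Hello World"}}}
```

With the keyword mapping, each array element is stored as one term, so "Hello World" is found as a whole; the match is also case-sensitive, since nothing lowercases the value.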
