ElasticSearch Cookbook (Second Edition)

Mapping arrays

An array or a multivalue field is very common in data models (such as multiple phone numbers, addresses, names, aliases, and so on), but it is not natively supported in traditional SQL solutions.

In SQL, multivalue fields require accessory tables that must be joined to gather all the values, which leads to poor performance when the number of records is large.

ElasticSearch, which works natively in JSON, provides support for multivalue fields transparently.

Getting ready

You need a working ElasticSearch cluster.

How to do it...

Every field is automatically managed as an array, so no special mapping is needed. For example, to store tags for a document, the mapping can be defined as follows:

{
  "document" : {
    "properties" : {
      "name" : {"type" : "string", "index" : "analyzed"},
      "tag" : {"type" : "string", "store" : "yes", "index" : "not_analyzed"},
      …
    }
  }
}

This mapping is valid for indexing this document:

{"name": "document1", "tag": "awesome"}

It can also be used for the following document:

{"name": "document2", "tag": ["cool", "awesome", "amazing"]}

How it works...

ElasticSearch manages arrays transparently: because of how its Lucene core stores fields, it makes no difference whether you supply a single value or multiple values.
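As a rough illustration of why the single-value and array forms are interchangeable, the following Python sketch treats every field as a list of values. It is only a conceptual model, not ElasticSearch's actual implementation; the helper names field_values and term_match are hypothetical.

```python
def field_values(doc, field):
    """Return the values of `field` as a list, whether the JSON held a scalar or an array."""
    value = doc.get(field)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def term_match(doc, field, term):
    """Exact-term match, similar in spirit to searching a not_analyzed field."""
    return term in field_values(doc, field)


# The two documents from the recipe: one scalar tag, one array of tags.
doc1 = {"name": "document1", "tag": "awesome"}
doc2 = {"name": "document2", "tag": ["cool", "awesome", "amazing"]}

matches = [d["name"] for d in (doc1, doc2) if term_match(d, "tag", "awesome")]
print(matches)  # ['document1', 'document2']
```

Both documents match the term "awesome", regardless of whether the tag was supplied as a single string or as an array.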

Lucene manages multiple values for a field by adding each of them to the document under the same field name (index_name in ElasticSearch). If index_name is not defined in the mapping, it is taken from the name of the field. It can also be set to other values for custom behaviors, such as renaming a field at the indexing level or merging two or more JSON fields into a single Lucene field. Redefine index_name with caution, as it also affects searching.

For people with a SQL background, this behavior might seem strange, but it is a key point in the NoSQL world, as it reduces the need for join queries and for separate tables to manage multiple values.

An array of embedded objects behaves in the same way as an array of simple fields.
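The idea of adding several values under one field name can be sketched in Python. This is a simplified model under stated assumptions, not the real Lucene or ElasticSearch code: the function lucene_fields is hypothetical, the "phones" field is invented for illustration, and the dotted naming of embedded object fields is assumed.

```python
def lucene_fields(value, prefix=""):
    """Yield (field_name, value) pairs, one pair per value, as a rough model of multivalue fields."""
    if isinstance(value, dict):
        # Embedded object: each key extends the field name with a dot.
        for key, child in value.items():
            name = f"{prefix}.{key}" if prefix else key
            yield from lucene_fields(child, name)
    elif isinstance(value, list):
        # Array: every element is added under the same field name.
        for item in value:
            yield from lucene_fields(item, prefix)
    else:
        yield (prefix, value)


doc = {
    "name": "document2",
    "tag": ["cool", "awesome"],
    # Hypothetical array of embedded objects, not part of the recipe's mapping.
    "phones": [
        {"kind": "home", "number": "555-0100"},
        {"kind": "work", "number": "555-0199"},
    ],
}

fields = list(lucene_fields(doc))
for name, value in fields:
    print(name, "=", value)
```

In this model the tag field appears twice, once per value, and the embedded objects contribute repeated phones.kind and phones.number entries in the same way that simple values do.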