Elastic Search Simplified: Part 2

Nitin Agarwal · Published in Level Up Coding · Oct 13, 2020

In this part, we will discuss Elasticsearch top-down. Please refer to Part 1 of this series if you want to understand Elasticsearch bottom-up.

This section has a prerequisite: a high-level understanding of distributed systems (consistency, availability, durability, etc.)

  1. What is an Elasticsearch cluster
  2. Shards
  3. Segments in Lucene Index
  4. The journey of an indexing request
  5. The journey of a search request
  6. Thread pools

What is an Elasticsearch cluster

An Elasticsearch cluster is a group of one or more Elasticsearch node instances that are connected together. The power of an Elasticsearch cluster lies in the distribution of tasks, such as searching and indexing, across all the nodes in the cluster.

Elasticsearch cluster nodes can take on the roles below. Several roles can be assigned to the same node, or we can have dedicated nodes for individual roles, depending on the use case.

  • Master nodes — The master node is responsible for lightweight cluster-wide actions such as creating or deleting an index, tracking which nodes are part of the cluster, and deciding which shards to allocate to which nodes. In a high-availability setup, we should have at least 3 master-eligible nodes so that a quorum can be formed.
  • Data nodes — store data and execute data-related operations such as search and aggregation.
  • Client/Coordinating nodes — every node has the coordinating role by default; we also have the option of a dedicated coordinating-only node. We will see the use case for this node later in this article.
  • Ingest nodes, remote-eligible nodes, machine learning nodes — for more details, refer to the documentation.
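As a sketch, dedicated roles are typically assigned in elasticsearch.yml. The exact setting names depend on the Elasticsearch version; the `node.roles` form shown here is the 7.9+ syntax, and the values are illustrative:

```yaml
# elasticsearch.yml — a dedicated master-eligible node (7.9+ syntax)
node.roles: [ master ]

# a dedicated data node would instead use:
# node.roles: [ data ]

# a dedicated coordinating-only node has an empty roles list:
# node.roles: [ ]
```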

Shards

A shard is a way to divide an Elasticsearch index into smaller chunks.

  • It allows you to horizontally split/scale your content volume
  • It allows you to distribute and parallelize operations across shards (potentially on multiple nodes) thus increasing performance/throughput

We have the option to set the number of shards when we create an index, but it is not trivial to change the shard count once the index has been created. Choosing the correct shard count is very important for optimal performance when an index is GBs in size: having shards in the range of thousands can easily become a performance bottleneck. Choosing the correct shard count is an advanced topic, and the right number varies from use case to use case.
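For example, the shard count is set in the request body at index-creation time (the index name and the numbers here are illustrative, not recommendations):

```
PUT /my-index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}
```

Note that number_of_replicas can be changed on a live index, while changing number_of_shards requires reindexing (or the shrink/split APIs).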

Shard Replicas

  • Replicas provide high availability in case a shard/node fails. For this reason, a replica shard is never allocated on the same node as the primary shard it was copied from.
  • Replicas allow you to scale out your search volume/throughput, since searches can be executed on all replicas in parallel.
  • Only the primary shard is used at the time of indexing; the operation is then replicated to the replicas.

Segments

Elasticsearch indexes are actually sharded Apache Lucene indexes. Lucene is optimized for performance, and to achieve this each individual Lucene index is divided into smaller files called segments. These segments are immutable: whenever we update a document, the new version is written to a new segment and the old document is marked for deletion. Elasticsearch periodically merges segments, which physically removes old segments and deleted documents. We can force-merge segments, but merging has a CPU and I/O cost. Elasticsearch also rebalances shards periodically.

Lucene searches all segments sequentially, so having a lot of segments can impact performance.
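The update-as-append behavior can be sketched in a few lines of Python. This is a toy model, not Lucene's actual data structures: each "segment" is immutable once written, an update simply appends the new version to a fresh segment (the old copy becomes stale, like Lucene's mark-for-deletion), and a merge rewrites only the live copies into a single segment.

```python
class ToyIndex:
    """Toy model of Lucene-style immutable segments (illustrative only)."""

    def __init__(self):
        self.segments = []  # each segment: an immutable tuple of (doc_id, body)

    def index(self, doc_id, body):
        # Writes always go to a new segment; existing segments are never
        # modified in place, so an update leaves a stale copy behind.
        self.segments.append(((doc_id, body),))

    def get(self, doc_id):
        # Newest segment wins, so stale copies are shadowed.
        for seg in reversed(self.segments):
            for d, body in seg:
                if d == doc_id:
                    return body
        return None

    def merge(self):
        # A merge rewrites only the live (newest) copy of each document
        # into one segment and drops the stale ones.
        live = {}
        for seg in self.segments:
            for d, body in seg:
                live[d] = body
        self.segments = [tuple(live.items())]

idx = ToyIndex()
idx.index(1, "v1")
idx.index(1, "v2")   # update: goes to a new segment, old copy is now stale
assert len(idx.segments) == 2 and idx.get(1) == "v2"
idx.merge()          # the stale copy is physically removed
assert len(idx.segments) == 1 and idx.get(1) == "v2"
```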

The Journey of an Indexing Request

The request initially reaches the coordinating node.

Routing

The request is routed to the primary of the target shard using the cluster's routing table. The primary shard then makes sure the index operation is replicated according to the index's replication settings (older versions exposed a write-consistency setting defaulting to "quorum"; newer versions use wait_for_active_shards).

In the case of a bulk operation, where we index multiple documents, the request is split and sent to each target shard's primary in parallel.
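Routing itself is deterministic: Elasticsearch computes roughly shard = hash(_routing) % number_of_primary_shards, where _routing defaults to the document _id. The real implementation uses murmur3 and a routing factor; in this sketch, crc32 stands in just to show the shape of the calculation:

```python
import zlib

def pick_shard(routing: str, num_primary_shards: int) -> int:
    # Illustrative stand-in: Elasticsearch actually hashes with murmur3
    # and accounts for a routing factor, but the idea is the same —
    # a stable hash of the routing value, modulo the primary shard count.
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards

# The same routing value always lands on the same primary shard, which is
# also why the primary shard count can't change after index creation.
assert pick_shard("user-42", 3) == pick_shard("user-42", 3)
assert 0 <= pick_shard("user-42", 3) < 3
```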

Index Operation Delay

When we wrote an integration test for an index operation, we saw that our tests were failing; whenever we debugged with breakpoints it would work, and putting a 1- or 2-second sleep between the indexing and the search request made it pass consistently. The reason is that a shard acknowledges the operation once it has written the document to a transaction log (similar to the write-ahead log in most databases), but the document is not yet part of the live index until the next refresh.

We can also make the document searchable immediately by using the refresh parameter:

PUT /test/_doc/1?refresh=true|false|wait_for
{"test": "test"}

Lucene writes to immutable segments that are eventually flushed to disk. The flush operation is not synchronized across nodes, so it is possible to get different results for the same request for a short period of time.

Journey of Search Requests

Like index requests, search requests first reach the coordinating node and are routed.

If it is a query such as multi_match, the request will be sent to all distinct shards; if the request carries a routing value, it will be sent only to the corresponding shard.

{
  "query": {
    "bool": {
      "must": [{
        "match": {
          "name": "Brunao"
        }
      }]
    }
  }
}

The query will run in two phases.

Phase 1

In phase 1, the coordinating node sends the query to the relevant shards, and each shard returns the ids (and scores) of its matching documents.

Note that the query term is tokenized, lowercased, etc., according to the analyzer of the respective field before searching.
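A minimal sketch of what a standard-style analysis chain does to the query term (real analyzers also apply token filters, stop words, stemming, and so on):

```python
import re

def analyze(text: str) -> list[str]:
    # Roughly what default analysis does: split on non-word boundaries,
    # then lowercase each token.
    return [t.lower() for t in re.findall(r"\w+", text)]

assert analyze("Quick Brown-Foxes!") == ["quick", "brown", "foxes"]
```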

Suppose we are given a size parameter, e.g. {{host}}:{{port}}/foobar/_search?size=10

In this case, the coordinating node will merge the results from all shards and keep the top 10 doc_ids based on the calculated scores.

Phase 2

Using the ids from the result of phase 1, the coordinating node reaches out to the shards holding those documents and finally returns the full documents.
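The two phases (often called "query then fetch") can be sketched as follows. This is a simplified model: each shard returns only lightweight (score, doc_id) pairs, the coordinator merges them and keeps the global top-k, and only those ids are fetched as full documents.

```python
import heapq

def query_then_fetch(shard_results, doc_store, size):
    # Phase 1 ("query"): every shard scores its local matches and returns
    # lightweight (score, doc_id) pairs, not full documents.
    candidates = [hit for shard in shard_results for hit in shard]
    top = heapq.nlargest(size, candidates)  # coordinator-side merge by score

    # Phase 2 ("fetch"): only the global top-`size` ids are resolved to
    # full documents on the shards that hold them.
    return [doc_store[doc_id] for _, doc_id in top]

# Two shards' local results as (score, doc_id), plus a toy document store.
shard_results = [[(0.9, "a"), (0.2, "b")], [(0.7, "c"), (0.1, "d")]]
doc_store = {"a": {"name": "A"}, "b": {"name": "B"},
             "c": {"name": "C"}, "d": {"name": "D"}}
assert query_then_fetch(shard_results, doc_store, 2) == [{"name": "A"}, {"name": "C"}]
```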

Thread Pools

Elasticsearch is written in Java and uses several thread pools on each node for indexing, searching, routing, etc. Some of the important thread pools are:

  • get
  • analyze
  • write
  • bulk
  • flush
  • force_merge
  • management

Thread pools come in the types below.

fixed

This type has a fixed number of threads and a bounded queue of fixed maximum size for pending requests.

fixed_auto_queue_size

It is similar to fixed, but the maximum queue size adjusts automatically based on the observed workload.

scaling

The scaling type holds a dynamic number of threads, subject to the parameters provided (e.g. core size, max size, and keep_alive).

All thread pools have default values set, which can be modified, but you should really know what you are doing before changing them.
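The "fixed" behavior above, a bounded queue that rejects work when it is full, can be sketched with Python's standard library. This models only the shape of the mechanism, not Elasticsearch's API; in the analogous situation Elasticsearch rejects the request (surfaced as an EsRejectedExecutionException).

```python
import queue

class FixedPoolQueue:
    """Models only the queueing policy of a "fixed" thread pool:
    a fixed queue_size, and rejection once the queue is full."""

    def __init__(self, queue_size: int):
        self.pending = queue.Queue(maxsize=queue_size)

    def submit(self, task) -> bool:
        try:
            self.pending.put_nowait(task)
            return True    # accepted: will be picked up by a worker thread
        except queue.Full:
            return False   # rejected, analogous to EsRejectedExecutionException

pool = FixedPoolQueue(queue_size=2)
assert pool.submit("req-1") is True
assert pool.submit("req-2") is True
assert pool.submit("req-3") is False   # queue full -> request rejected
```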
