ELK Stack – Elasticsearch Zen Discovery

Elasticsearch recommends dedicated master, data and ingest nodes for a production deployment. A typical cluster has a few master nodes, several data nodes and, optionally, some ingest nodes. A client node (a node with the master, data and ingest roles all disabled) is also recommended on the same box where Kibana is installed. It holds no data and only load-balances the requests Kibana sends to the Elasticsearch instances.

Now let's look at how to set up master and data nodes.

Elastic has excellent documentation on its website, and almost all the required information is there. But it is not always easy to understand until you have actually tried some of the settings and then realized, ah, that’s what this means.

The recommended master node setup has to take the split-brain scenario into account. Split brain means the cluster ends up with two brains: two master nodes acting independently of each other, for example on either side of a network partition. To avoid it in a setup with multiple master nodes, discovery.zen.minimum_master_nodes has to be set using the formula below:

 (number of master-eligible nodes / 2) + 1

For example, if there are 3 master nodes, the division rounds down (3 / 2 = 1), so the minimum number of master nodes that must be online is

   (3 / 2) + 1 = 2
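
The arithmetic above relies on integer division. A minimal Python sketch of the same formula, plus a check of what it means during a network partition, could look like this. The function names minimum_master_nodes and can_elect_master are illustrative helpers, not part of Elasticsearch.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: (n / 2) + 1 using integer division."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect_master(reachable_masters: int, master_eligible: int) -> bool:
    """A group of nodes may elect a master only if it can see a quorum."""
    return reachable_masters >= minimum_master_nodes(master_eligible)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "master-eligible nodes -> quorum", minimum_master_nodes(n))
    # 3 masters split 2 / 1 by a partition: only the side with 2 has a quorum.
    print(can_elect_master(2, 3), can_elect_master(1, 3))
```

Note that with 2 master-eligible nodes the quorum is also 2, so losing either node blocks master election. That is one reason an odd count such as 3 is the usual choice.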

The following configs go in the elasticsearch.yml file for each node in the cluster.

Master node config:

node.master: true 
node.data: false 
node.ingest: false 

discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["esmaster0", "esmaster1", "esmaster2"]

Data node config:

node.master: false
node.data: true
node.ingest: false

discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["esmaster0", "esmaster1", "esmaster2"]

Client node config:

node.master: false
node.data: false
node.ingest: false

discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["esmaster0", "esmaster1", "esmaster2"]

The important part here is the “unicast.hosts” setting. Notice that only the master nodes are listed there. Every other node contacts these hosts to join the cluster, and Elasticsearch is smart enough to identify the data and ingest nodes and route requests accordingly.
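
The three configs differ only in the node.* flags, while the discovery settings are identical on every node. As a rough sketch, the Python snippet below renders the same elasticsearch.yml fragments from one list of master hosts. The names ROLES and render_config are hypothetical, and the snippet assumes the unicast host list contains exactly the master-eligible nodes, so the quorum can be derived from its length.

```python
# Host names taken from the configs above.
MASTER_HOSTS = ["esmaster0", "esmaster1", "esmaster2"]

# Role flags per node type, as used in this post.
ROLES = {
    "master": {"master": True, "data": False, "ingest": False},
    "data": {"master": False, "data": True, "ingest": False},
    "client": {"master": False, "data": False, "ingest": False},
}


def render_config(role: str, master_hosts=MASTER_HOSTS) -> str:
    """Return the elasticsearch.yml fragment for one node type."""
    flags = ROLES[role]
    # Assumption: every listed host is master-eligible.
    quorum = len(master_hosts) // 2 + 1
    hosts = ", ".join('"{}"'.format(h) for h in master_hosts)
    lines = ["node.{}: {}".format(name, str(value).lower())
             for name, value in flags.items()]
    lines.append("")
    lines.append("discovery.zen.minimum_master_nodes: {}".format(quorum))
    lines.append("discovery.zen.ping.unicast.hosts: [{}]".format(hosts))
    return "\n".join(lines)


if __name__ == "__main__":
    for role in ROLES:
        print("# {} node".format(role))
        print(render_config(role))
        print()
```

Generating the fragments from a single host list keeps minimum_master_nodes and the host list from drifting apart when a master node is added or removed.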

When we use a tool like Cerebro (detailed in another post) to monitor the ELK stack, we would only see the data and ingest nodes and not the master nodes. The master nodes are responsible for lightweight cluster-wide actions such as creating and deleting indices, allocating shards and tracking the nodes that are part of the cluster.
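
Independently of any monitoring tool, the roles each node actually holds can be cross-checked with the _cat/nodes API, for example GET _cat/nodes?h=name,node.role,master. The sketch below only parses that plain-text output. The function name parse_cat_nodes is hypothetical, the sample text is made up for illustration (esdata0 and eskibana are invented node names), and the parser assumes node names contain no spaces.

```python
def parse_cat_nodes(text: str) -> list:
    """Parse output of GET _cat/nodes?h=name,node.role,master.

    node.role holds letters such as m (master-eligible), d (data) and
    i (ingest), or "-" for a coordinating-only (client) node.
    master is "*" for the currently elected master, "-" otherwise.
    """
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, role, master = parts
        nodes.append({
            "name": name,
            "master_eligible": "m" in role,
            "data": "d" in role,
            "ingest": "i" in role,
            "elected_master": master == "*",
        })
    return nodes


# Hypothetical sample output, not taken from a real cluster.
SAMPLE = """\
esmaster0 m *
esmaster1 m -
esdata0   d -
eskibana  - -
"""

if __name__ == "__main__":
    for node in parse_cat_nodes(SAMPLE):
        print(node)
```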

That being said, the master nodes can get by with a little less memory than data or ingest nodes. Data and ingest nodes process a huge volume of data, depending on the stack size, and thus require more memory and a more powerful CPU.

References: https://www.elastic.co/guide/en/elasticsearch/reference/5.1/modules-node.html#split-brain
