Elasticsearch recommends setting up dedicated master, data and ingest nodes for a production deployment. The typical structure is a few master nodes, some data nodes and some or no ingest nodes. There is also a client node, which is recommended to be installed on the same box where Kibana is installed. The client node is only used for load balancing across the Elasticsearch instances that Kibana talks to.
Now let's look at how to set up master and data nodes.
Elastic has excellent documentation on its website and almost all the required info is out there. But many times it may not be easy to understand until you have actually tried some of the settings and then realized, ah, that's what this means.
The recommended master node setup has to take the split brain scenario into account. This is about avoiding two brains in the cluster, meaning two master nodes that act independently of each other. In a multi master node setup, discovery.zen.minimum_master_nodes has to be set using the formula below
(number of master-eligible nodes / 2) + 1
For example, if there are 3 master nodes, then the minimum number of master nodes required to be online is
(3 / 2) + 1 = 2 (using integer division, so 3 / 2 rounds down to 1)
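The quorum arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula; the function name minimum_master_nodes is made up for this example and is not part of any Elasticsearch client.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum for discovery.zen.minimum_master_nodes: (N / 2) + 1 with integer division."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    # // is integer division, so 3 // 2 == 1 and the result for 3 nodes is 2
    return master_eligible // 2 + 1


for n in (1, 2, 3, 5):
    print(n, "master-eligible nodes -> quorum", minimum_master_nodes(n))
```

Note that with only 2 master-eligible nodes the quorum is also 2, so losing either node leaves the cluster without a master. That is why 3 is the usual minimum for a multi master setup.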
The following configs go in the elasticsearch.yml file for each node in the cluster.
Master node config:
node.master: true
node.data: false
node.ingest: false
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["esmaster0", "esmaster1", "esmaster2"]
Data node config:
node.master: false
node.data: true
node.ingest: false
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["esmaster0", "esmaster1", "esmaster2"]
Client node config:
node.master: false
node.data: false
node.ingest: false
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["esmaster0", "esmaster1", "esmaster2"]
The important part here is the "unicast.hosts" setting. Notice that the master nodes are the only nodes listed here. Every node contacts these hosts to find and join the cluster, and Elasticsearch is smart enough to identify the data and ingest nodes and route requests accordingly.
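Since the three configs differ only in their role flags, they can be generated from one template. The sketch below is a hypothetical helper, not an official tool: the names ROLES, MASTER_HOSTS and render_node_config are invented for this example, and the host names are the ones used in this post.

```python
import json

# Master hosts from this post; every node type lists only these.
MASTER_HOSTS = ["esmaster0", "esmaster1", "esmaster2"]

# Role flags taken from the configs above: (node.master, node.data, node.ingest)
ROLES = {
    "master": (True, False, False),
    "data": (False, True, False),
    "client": (False, False, False),
}


def render_node_config(role, master_hosts=MASTER_HOSTS):
    """Return the elasticsearch.yml lines for one node role as a string."""
    master, data, ingest = ROLES[role]
    quorum = len(master_hosts) // 2 + 1  # (N / 2) + 1 with integer division
    lines = [
        f"node.master: {str(master).lower()}",
        f"node.data: {str(data).lower()}",
        f"node.ingest: {str(ingest).lower()}",
        f"discovery.zen.minimum_master_nodes: {quorum}",
        # json.dumps renders the list in a form that is also valid YAML
        f"discovery.zen.ping.unicast.hosts: {json.dumps(master_hosts)}",
    ]
    return "\n".join(lines) + "\n"


for role in ROLES:
    print(f"# {role} node")
    print(render_node_config(role))
```

Deriving the quorum from the length of the master host list keeps minimum_master_nodes in step with the number of master nodes, assuming every master-eligible node is in that list.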
When we use a tool like cerebro (detailed in another post) to monitor the ELK stack, we would only see the data and ingest nodes and not the master nodes. The master nodes are responsible for lightweight cluster-wide actions like creating and deleting indexes, allocating shards and tracking the nodes that are part of the cluster.
That being said, the master nodes can have a little less memory than the data or ingest nodes. Data and ingest nodes process a huge volume of data, depending on the stack size, and thus require more memory and a more powerful CPU.
References: https://www.elastic.co/guide/en/elasticsearch/reference/5.1/modules-node.html#split-brain