ES Master Election Strategy

news/2024/7/5 3:01:33

ES version: 5.6.3

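1. The whole process is triggered after the node starts. The start() method in Node.java calls doStart() in ZenDiscovery.java, and the startInitialJoin method is then called to either join an existing cluster or elect a master.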

public void startInitialJoin() {
    // start the join thread from a cluster state update. See {@link JoinThreadControl} for details.
    clusterService.submitStateUpdateTask("initial_join", new LocalClusterUpdateTask() {

        @Override
        public ClusterTasksResult<LocalClusterUpdateTask> execute(ClusterState currentState) throws Exception {
            // do the join on a different thread, the DiscoveryService waits for 30s anyhow till it is discovered
            joinThreadControl.startNewThreadIfNotRunning();
            return unchanged();
        }

        @Override
        public void onFailure(String source, @org.elasticsearch.common.Nullable Exception e) {
            logger.warn("failed to start initial join process", e);
        }
    });
}

2. In the ZenDiscovery class, startNewThreadIfNotRunning runs innerJoinCluster(), which does the actual election work. Inside it, findMaster() picks the master node.

private void innerJoinCluster() {
    DiscoveryNode masterNode = null;
    final Thread currentThread = Thread.currentThread();
    nodeJoinController.startElectionContext();
    while (masterNode == null && joinThreadControl.joinThreadActive(currentThread)) {
        masterNode = findMaster();
    }
    ......
}

 

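3. In findMaster(), pingAndWait() collects the nodes that can currently be pinged and returns a PingResponse for each. A response contains the node's own information plus the master that node currently knows about. The election then proceeds from these responses.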

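  One setting deserves attention here: discovery.zen.master_election.ignore_non_master_pings, which defaults to false. It controls whether ping responses from nodes that are not master-eligible (for example data nodes with node.master: false and node.data: true) are ignored during the election. With a small cluster there is usually no need to change it. For a very large cluster, consider letting only master-eligible nodes take part in the election.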

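  Next, the pingResponses are used to determine whether a master already exists. Masters reported by other nodes are stored in activeMasters, and master-eligible nodes are stored in masterCandidates.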

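  If activeMasters is empty, no master currently exists, so an election is held (step 4). Note discovery.zen.minimum_master_nodes here: the number of candidates must be greater than or equal to this value for the election to proceed, otherwise no master can be elected. The setting is normally configured as (N/2)+1, where N is the number of master-eligible nodes, to protect the cluster against split brain.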

  If activeMasters is not empty, the existing master is used; it is the master found in step 5.
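To make the quorum rule concrete, here is a minimal sketch of the arithmetic. The class and method names are my own and are not Elasticsearch code; the check is a simplified stand-in for what hasEnoughCandidates has to decide.

```java
final class QuorumSketch {

    /** Majority of master-eligible nodes: (N / 2) + 1, using integer division. */
    static int recommendedMinimumMasterNodes(int masterEligibleNodes) {
        return masterEligibleNodes / 2 + 1;
    }

    /** Simplified quorum check: the election may proceed only with enough candidates. */
    static boolean hasEnoughCandidates(int candidatesFound, int minimumMasterNodes) {
        return candidatesFound >= minimumMasterNodes;
    }

    public static void main(String[] args) {
        int masterEligible = 5;
        int quorum = recommendedMinimumMasterNodes(masterEligible);
        System.out.println("minimum_master_nodes for " + masterEligible + " nodes: " + quorum);
        // A 5-node cluster split into partitions of 3 and 2: only the larger side may elect.
        System.out.println("partition of 3 can elect: " + hasEnoughCandidates(3, quorum));
        System.out.println("partition of 2 can elect: " + hasEnoughCandidates(2, quorum));
    }
}
```

Because at most one partition can hold a majority, two masters cannot be elected at the same time, which is why the value guards against split brain.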

private DiscoveryNode findMaster() {
    logger.trace("starting to ping");
    List<ZenPing.PingResponse> fullPingResponses = pingAndWait(pingTimeout).toList();
    if (fullPingResponses == null) {
        logger.trace("No full ping responses");
        return null;
    }
    if (logger.isTraceEnabled()) {
        StringBuilder sb = new StringBuilder();
        if (fullPingResponses.size() == 0) {
            sb.append(" {none}");
        } else {
            for (ZenPing.PingResponse pingResponse : fullPingResponses) {
                sb.append("\n\t--> ").append(pingResponse);
            }
        }
        logger.trace("full ping responses:{}", sb);
    }

    final DiscoveryNode localNode = clusterService.localNode();

    // add our selves
    assert fullPingResponses.stream().map(ZenPing.PingResponse::node)
        .filter(n -> n.equals(localNode)).findAny().isPresent() == false;

    fullPingResponses.add(new ZenPing.PingResponse(localNode, null, clusterService.state()));

    // filter responses
    final List<ZenPing.PingResponse> pingResponses = filterPingResponses(fullPingResponses, masterElectionIgnoreNonMasters, logger);

    List<DiscoveryNode> activeMasters = new ArrayList<>();
    for (ZenPing.PingResponse pingResponse : pingResponses) {
        // We can't include the local node in pingMasters list, otherwise we may up electing ourselves without
        // any check / verifications from other nodes in ZenDiscover#innerJoinCluster()
        if (pingResponse.master() != null && !localNode.equals(pingResponse.master())) {
            activeMasters.add(pingResponse.master());
        }
    }

    // nodes discovered during pinging
    List<ElectMasterService.MasterCandidate> masterCandidates = new ArrayList<>();
    for (ZenPing.PingResponse pingResponse : pingResponses) {
        if (pingResponse.node().isMasterNode()) {
            masterCandidates.add(new ElectMasterService.MasterCandidate(pingResponse.node(), pingResponse.getClusterStateVersion()));
        }
    }

    if (activeMasters.isEmpty()) {
        if (electMaster.hasEnoughCandidates(masterCandidates)) {
            final ElectMasterService.MasterCandidate winner = electMaster.electMaster(masterCandidates);
            logger.trace("candidate {} won election", winner);
            return winner.getNode();
        } else {
            // if we don't have enough master nodes, we bail, because there are not enough master to elect from
            logger.warn("not enough master nodes discovered during pinging (found [{}], but needed [{}]), pinging again",
                masterCandidates, electMaster.minimumMasterNodes());
            return null;
        }
    } else {
        assert !activeMasters.contains(localNode) : "local node should never be elected as master when other nodes indicate an active master";
        // lets tie break between discovered nodes
        return electMaster.tieBreakActiveMasters(activeMasters);
    }
}
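The body of filterPingResponses is not shown above, so the following is only a sketch of how I understand the filtering and classification. It uses a hypothetical Ping class in place of ZenPing.PingResponse, and it assumes that ignore_non_master_pings simply drops responses from nodes that are not master-eligible.

```java
import java.util.ArrayList;
import java.util.List;

final class PingClassificationSketch {

    /** Hypothetical, simplified stand-in for ZenPing.PingResponse. */
    static final class Ping {
        final String nodeId;
        final boolean masterEligible;
        final String reportedMasterId; // null when the node knows of no master

        Ping(String nodeId, boolean masterEligible, String reportedMasterId) {
            this.nodeId = nodeId;
            this.masterEligible = masterEligible;
            this.reportedMasterId = reportedMasterId;
        }
    }

    /** Assumed behaviour: when the flag is set, keep only master-eligible responders. */
    static List<Ping> filter(List<Ping> all, boolean ignoreNonMasterPings) {
        if (!ignoreNonMasterPings) {
            return all;
        }
        List<Ping> kept = new ArrayList<>();
        for (Ping p : all) {
            if (p.masterEligible) {
                kept.add(p);
            }
        }
        return kept;
    }

    /** Masters reported by responders, excluding the local node (as in findMaster). */
    static List<String> activeMasters(List<Ping> pings, String localNodeId) {
        List<String> result = new ArrayList<>();
        for (Ping p : pings) {
            if (p.reportedMasterId != null && !localNodeId.equals(p.reportedMasterId)) {
                result.add(p.reportedMasterId);
            }
        }
        return result;
    }

    /** Responders that are themselves master-eligible. */
    static List<String> masterCandidates(List<Ping> pings) {
        List<String> result = new ArrayList<>();
        for (Ping p : pings) {
            if (p.masterEligible) {
                result.add(p.nodeId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Ping> all = new ArrayList<>();
        all.add(new Ping("n1", true, null));  // local node, knows no master
        all.add(new Ping("n2", true, "n3"));  // master-eligible, reports n3
        all.add(new Ping("d1", false, "n3")); // data-only node, reports n3
        List<Ping> filtered = filter(all, true);
        System.out.println("active masters: " + activeMasters(filtered, "n1"));
        System.out.println("candidates: " + masterCandidates(filtered));
    }
}
```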

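4. The electMaster() method in ElectMasterService.java is the concrete election implementation. The logic is very simple: sort the current candidates, and the one that sorts first becomes the master.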

public MasterCandidate electMaster(Collection<MasterCandidate> candidates) {
    assert hasEnoughCandidates(candidates);
    List<MasterCandidate> sortedCandidates = new ArrayList<>(candidates);
    sortedCandidates.sort(MasterCandidate::compare);
    return sortedCandidates.get(0);
}
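MasterCandidate::compare is not shown above. As far as I can tell from the 5.x source, it prefers the higher cluster state version and falls back to comparing nodes on a tie. The sketch below illustrates that ordering with a hypothetical Candidate class and a plain node id comparison as the fallback; treat the exact ordering as an assumption to verify against your ES version.

```java
import java.util.ArrayList;
import java.util.List;

final class CandidateOrderSketch {

    /** Hypothetical stand-in for ElectMasterService.MasterCandidate. */
    static final class Candidate {
        final String nodeId;
        final long clusterStateVersion;

        Candidate(String nodeId, long clusterStateVersion) {
            this.nodeId = nodeId;
            this.clusterStateVersion = clusterStateVersion;
        }
    }

    /** Assumed ordering: higher cluster state version first, then smaller node id. */
    static int compare(Candidate a, Candidate b) {
        // Arguments are swapped on purpose so that the higher version sorts first.
        int ret = Long.compare(b.clusterStateVersion, a.clusterStateVersion);
        if (ret == 0) {
            ret = a.nodeId.compareTo(b.nodeId);
        }
        return ret;
    }

    /** Same shape as electMaster(): sort a copy and take the first element. */
    static Candidate elect(List<Candidate> candidates) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(CandidateOrderSketch::compare);
        return sorted.get(0);
    }

    public static void main(String[] args) {
        List<Candidate> candidates = new ArrayList<>();
        candidates.add(new Candidate("b", 7));
        candidates.add(new Candidate("a", 5));
        candidates.add(new Candidate("c", 7));
        System.out.println("winner: " + elect(candidates).nodeId);
    }
}
```

Since every node sorts the same candidate list with the same comparator, nodes that see the same candidates arrive at the same winner without any extra coordination.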

 

5. When a master already exists, the node joins the existing cluster. If several masters are reported, the one that sorts first is chosen as the cluster to join.

public DiscoveryNode tieBreakActiveMasters(Collection<DiscoveryNode> activeMasters) {
    return activeMasters.stream().min(ElectMasterService::compareNodes).get();
}
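ElectMasterService::compareNodes is not listed above either. My understanding is that it puts master-eligible nodes first and then orders by node id; the sketch below shows a tie break under that assumption, with a hypothetical Node class in place of DiscoveryNode.

```java
import java.util.Arrays;
import java.util.Collection;

final class TieBreakSketch {

    /** Hypothetical stand-in for DiscoveryNode. */
    static final class Node {
        final String id;
        final boolean masterEligible;

        Node(String id, boolean masterEligible) {
            this.id = id;
            this.masterEligible = masterEligible;
        }
    }

    /** Assumed ordering: master-eligible nodes first, then by node id. */
    static int compareNodes(Node a, Node b) {
        if (a.masterEligible && !b.masterEligible) {
            return -1;
        }
        if (!a.masterEligible && b.masterEligible) {
            return 1;
        }
        return a.id.compareTo(b.id);
    }

    /** Picks the smallest node; the collection must not be empty. */
    static Node tieBreak(Collection<Node> activeMasters) {
        return activeMasters.stream().min(TieBreakSketch::compareNodes).get();
    }

    public static void main(String[] args) {
        Node chosen = tieBreak(Arrays.asList(
            new Node("x", true), new Node("a", false), new Node("m", true)));
        System.out.println("join cluster of: " + chosen.id);
    }
}
```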

 

 

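6. Once a master has been chosen, the cluster has to be formed. If the chosen master is the local node, the local node waits for the other nodes to join. I am not fully sure about this logic, because it is implemented asynchronously. Roughly: every node registers a handler for the internal:discovery/zen/join request at startup. Once a node becomes master, the other nodes use that request to contact it and join its cluster. A listener is registered in MembershipListener.java; when a node joins, an asynchronous callback stores the information in ElectionContext, an inner class of NodeJoinController, which checks whether enough nodes have joined and therefore whether this election succeeded. This part of the logic is rather convoluted and I am not sure I have understood it correctly. If anyone knows it better, corrections are very welcome!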
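To make my reading of step 6 concrete, here is a small sketch of the "wait until enough joins arrive" idea. It is not the real ElectionContext: the class name, the addJoin method, and the rule that the local node counts as one vote (so minimum_master_nodes - 1 joins from master-eligible nodes are required) are all assumptions on my part.

```java
import java.util.HashSet;
import java.util.Set;

final class ElectionContextSketch {

    private final int requiredMasterJoins;
    private final Set<String> pendingMasterJoins = new HashSet<>();

    /** Assumption: the local node counts as one vote, so one fewer join is required. */
    ElectionContextSketch(int minimumMasterNodes) {
        this.requiredMasterJoins = Math.max(0, minimumMasterNodes - 1);
    }

    /**
     * Records a join request and reports whether the election is now won.
     * Only joins from master-eligible nodes count; repeated joins count once.
     */
    synchronized boolean addJoin(String nodeId, boolean masterEligible) {
        if (masterEligible) {
            pendingMasterJoins.add(nodeId);
        }
        return isElected();
    }

    synchronized boolean isElected() {
        return pendingMasterJoins.size() >= requiredMasterJoins;
    }

    public static void main(String[] args) {
        ElectionContextSketch context = new ElectionContextSketch(3);
        System.out.println("after n2 joins: " + context.addJoin("n2", true));
        System.out.println("after d1 joins: " + context.addJoin("d1", false));
        System.out.println("after n3 joins: " + context.addJoin("n3", true));
    }
}
```

The methods are synchronized because, in the flow described above, join requests arrive on other threads while the elected node is waiting.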

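7. If the chosen master is not the local node, the local node joins that cluster via joinElectedMaster() in ZenDiscovery.java. The number of attempts is controlled by discovery.zen.join_retry_attempts, which defaults to 3. The timeout of each attempt is controlled by discovery.zen.join_timeout, which defaults to discovery.zen.ping_timeout * 20; with the default ping_timeout of 3s, that is 60s. This value should therefore not be set too high, otherwise a failed election takes a long time to time out. If joining the master fails or times out, a new election round starts, and this repeats until a master that satisfies the conditions is elected.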

private boolean joinElectedMaster(DiscoveryNode masterNode) {
    try {
        // first, make sure we can connect to the master
        transportService.connectToNode(masterNode);
    } catch (Exception e) {
        logger.warn((Supplier<?>) () -> new ParameterizedMessage("failed to connect to master [{}], retrying...", masterNode), e);
        return false;
    }
    int joinAttempt = 0; // we retry on illegal state if the master is not yet ready
    while (true) {
        try {
            logger.trace("joining master {}", masterNode);
            membership.sendJoinRequestBlocking(masterNode, clusterService.localNode(), joinTimeout);
            return true;
        } catch (Exception e) {
            final Throwable unwrap = ExceptionsHelper.unwrapCause(e);
            if (unwrap instanceof NotMasterException) {
                if (++joinAttempt == this.joinRetryAttempts) {
                    logger.info("failed to send join request to master [{}], reason [{}], tried [{}] times", masterNode, ExceptionsHelper.detailedMessage(e), joinAttempt);
                    return false;
                } else {
                    logger.trace("master {} failed with [{}]. retrying... (attempts done: [{}])", masterNode, ExceptionsHelper.detailedMessage(e), joinAttempt);
                }
            } else {
                if (logger.isTraceEnabled()) {
                    logger.trace((Supplier<?>) () -> new ParameterizedMessage("failed to send join request to master [{}]", masterNode), e);
                } else {
                    logger.info("failed to send join request to master [{}], reason [{}]", masterNode, ExceptionsHelper.detailedMessage(e));
                }
                return false;
            }
        }

        try {
            Thread.sleep(this.joinRetryDelay.millis());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
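The retry behaviour of joinElectedMaster can be isolated into a small sketch. Everything here (JoinAction, NotMasterYetException, joinWithRetry) is a hypothetical name for illustration. One deliberate difference from the code above: this sketch gives up when the sleeping thread is interrupted instead of continuing the loop.

```java
final class JoinRetrySketch {

    /** Hypothetical stand-in for NotMasterException: the target is not master yet. */
    static final class NotMasterYetException extends RuntimeException {
    }

    /** Hypothetical stand-in for the blocking join request. */
    interface JoinAction {
        void join() throws Exception;
    }

    /** Default join timeout as described in step 7: 20 times the ping timeout. */
    static long defaultJoinTimeoutMillis(long pingTimeoutMillis) {
        return 20 * pingTimeoutMillis;
    }

    /**
     * Retries only while the target reports that it is not master yet.
     * Any other failure, or exhausting the attempts, returns false.
     */
    static boolean joinWithRetry(JoinAction action, int retryAttempts, long retryDelayMillis) {
        int attempt = 0;
        while (true) {
            try {
                action.join();
                return true;
            } catch (NotMasterYetException e) {
                if (++attempt == retryAttempts) {
                    return false;
                }
            } catch (Exception e) {
                return false; // not retried, the caller starts a new election round
            }
            try {
                Thread.sleep(retryDelayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false; // sketch-only choice: stop retrying when interrupted
            }
        }
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        boolean joined = joinWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new NotMasterYetException();
            }
        }, 3, 10L);
        System.out.println("joined=" + joined + " after " + calls[0] + " calls");
        System.out.println("default join timeout (ms): " + defaultJoinTimeoutMillis(3000L));
    }
}
```

Note that only the "not master yet" case is retried. A connection failure or timeout returns false immediately, so the worst case for a single round is dominated by the join timeout rather than by the retry count.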

 

 

At this point the master has been elected. That is the rough logic; some of the details in between still deserve a closer look.

 

Reposted from: https://www.cnblogs.com/woniu4/p/9348664.html

