Hadoop HA with QJM
Preface
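This article walks through the process of configuring a hadoop 2.2.0 cluster. Every step has been tested in practice by me. The structure of the document follows my own setup, and I also include the problems I ran into along the way. Building the environment is not the important part; what matters is the problems encountered during the setup and how they were solved.
Some of these problems may be no problem at all for experienced veterans, but they cost me a great deal of time, and solving them took a lot of effort. This document records them as a summary and as advice for those who come after.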
Hadoop 2.2.0 Architecture
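To follow this section you first need to understand the hadoop1 architecture. I will not say much about it here: I set up a hadoop 1.2.1 pseudo-distributed cluster earlier, and the details are in "Hadoop Study (1): hadoop-1.2.1 pseudo-distributed configuration and the problems encountered". This section concentrates on the hadoop2 architecture.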
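hadoop1 has two core components, HDFS and MapReduce. In hadoop2 these become HDFS and YARN, with MapReduce running as an application on top of YARN.
The new HDFS no longer has just a single NameNode; it can have several (in this version, only 2 per nameservice). Each of them has the same role.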
How do the two NameNodes relate to each other? One is in the active state and the other is in the standby state. While the cluster is running, only the active NameNode serves requests; the standby NameNode waits in reserve and continuously synchronizes the active NameNode's data. Once the active NameNode can no longer work, the standby NameNode can be switched to active, either manually or automatically, and carry on the work. This is high availability.
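How is their data kept consistent when a NameNode fails? The two NameNodes share their metadata changes (the edit log) in near real time. The new HDFS offers two sharing mechanisms: a JournalNode cluster or NFS. NFS works at the operating-system level, while JournalNodes work at the Hadoop level. Here we use a JournalNode cluster to share the data.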
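The JournalNode approach depends on majority writes: according to the Hadoop documentation for the Quorum Journal Manager, an edit counts as written once a majority of JournalNodes has accepted it, so a group of N JournalNodes can tolerate (N - 1) / 2 failures. The Python sketch below only illustrates that arithmetic; the helper name `journal_quorum` is made up for this example and is not part of Hadoop.

```python
from typing import Tuple


def journal_quorum(num_journal_nodes: int) -> Tuple[int, int]:
    """Return (majority_needed, failures_tolerated) for a JournalNode group.

    Hypothetical helper for illustration only; it is not a Hadoop API.
    """
    if num_journal_nodes < 1:
        raise ValueError("need at least one JournalNode")
    # A write must be acknowledged by more than half of the nodes.
    majority = num_journal_nodes // 2 + 1
    # The group keeps working as long as a majority is still alive.
    tolerated = (num_journal_nodes - 1) // 2
    return majority, tolerated


if __name__ == "__main__":
    for n in (3, 4, 5):
        majority, tolerated = journal_quorum(n)
        print(f"{n} JournalNodes: majority={majority}, tolerated failures={tolerated}")
```

With the five JournalNodes listed in this article's `dfs.namenode.shared.edits.dir`, the sketch gives a majority of 3 and 2 tolerated failures. An even count adds no extra tolerance (4 nodes still tolerate only 1 failure), which is why odd group sizes are the usual choice.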
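How is automatic NameNode failover implemented? It requires a ZooKeeper cluster for the election. Both NameNodes in the HDFS cluster register in ZooKeeper through the ZKFailoverController (ZKFC) process that runs alongside each NameNode. When the active NameNode fails, this is detected through ZooKeeper and the standby NameNode is automatically switched to the active state.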
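HDFS Federation: federation exists for a reason. The NameNode is the core node that maintains the metadata of the whole HDFS, and it holds that metadata in memory, so its capacity is limited by the server's RAM. Once the NameNode's memory can no longer hold the metadata, the HDFS cluster cannot take any more data and has reached the end of its life, so scalability is limited. Federation lets several NameNodes work in one cluster at the same time, each managing its own independent namespace, so the total capacity is in theory no longer bounded; to exaggerate a little, it can grow without limit. You can picture it as one large cluster carved into two or more logical sub-clusters that share the same DataNodes as common block storage. The NameNodes do not coordinate with each other, so the failure of one NameNode does not affect the namespaces served by the others. Federation by itself does not provide failover, though: to keep a namespace available when its NameNode dies, that nameservice still needs the active/standby HA pair described above.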
The configuration files are as follows:
core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cluster1</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>moses.zookeeper0:2181,moses.zookeeper1:2181,moses.zookeeper2:2181,moses.zookeeper3:2181,moses.zookeeper4:2181</value>
  </property>
</configuration>

hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>

  <!-- Logical nameservice and its two NameNodes -->
  <property>
    <name>dfs.nameservices</name>
    <value>cluster1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.cluster1</name>
    <value>n1,n2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.n1</name>
    <value>moses.namenode:9090</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.n1</name>
    <value>moses.namenode:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.n2</name>
    <value>moses.datanode3:9090</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.n2</name>
    <value>moses.datanode3:50070</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.cluster1.n1</name>
    <value>moses.namenode:53310</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.cluster1.n2</name>
    <value>moses.datanode3:53310</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled.cluster1</name>
    <value>true</value>
  </property>

  <!-- Shared edit log on the JournalNode cluster (QJM) -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://moses.namenode:8485;moses.datanode1:8485;moses.datanode2:8485;moses.datanode3:8485;moses.datanode4:8485/cluster1</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.cluster1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/data/wapage/journal</value>
  </property>

  <!-- Fencing of the previously active NameNode during failover -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/wapage/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>10000</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>60</value>
  </property>

  <!-- Deprecated name in Hadoop 2; the current name is dfs.datanode.max.transfer.threads -->
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>moses.data.namenode:9091</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data2/wapage/hadooptmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.period</name>
    <value>600</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/wapage/hadoopname,/data1/wapage/hadoopname,/data2/wapage/hadoopname,/data3/wapage/hadoopname,/data4/wapage/hadoopname,/data5/wapage/hadoopname,/data6/wapage/hadoopname,/data7/wapage/hadoopname</value>
    <description>Determines where on the local filesystem the DFS name node
      should store the name table(fsimage). If this is a comma-delimited list
      of directories then the name table is replicated in all of the
      directories, for redundancy.
    </description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/wapage/hadoopdata,/data1/wapage/hadoopdata,/data2/wapage/hadoopdata,/data3/wapage/hadoopdata,/data4/wapage/hadoopdata,/data5/wapage/hadoopdata,/data6/wapage/hadoopdata,/data7/wapage/hadoopdata</value>
    <description>Determines where on the local filesystem an DFS data node
      should store its blocks. If this is a comma-delimited
      list of directories, then data will be stored in all named
      directories, typically on different devices.
      Directories that do not exist are ignored.
    </description>
  </property>
  <!-- Deprecated name in Hadoop 2; the current name is dfs.datanode.balance.bandwidthPerSec -->
  <property>
    <name>dfs.balance.bandwidthPerSec</name>
    <value>10485760</value>
    <description>
      Specifies the maximum amount of bandwidth that each datanode
      can utilize for the balancing purpose in term of
      the number of bytes per second.
    </description>
  </property>
</configuration>

Reference: http://blog.csdn.net/yczws1/article/details/23566383
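HA settings like the ones in hdfs-site.xml are easy to get subtly wrong, for example a NameNode ID listed in `dfs.ha.namenodes.cluster1` with no matching `rpc-address` entry. The Python sketch below shows one possible sanity check. It is not a Hadoop tool: the function names `load_properties`, `check_ha`, and `journal_nodes` are invented for this example, and `SAMPLE` is a shortened excerpt of the configuration shown in this article.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Shortened excerpt of the hdfs-site.xml from this article (illustration only).
SAMPLE = """<configuration>
  <property><name>dfs.nameservices</name><value>cluster1</value></property>
  <property><name>dfs.ha.namenodes.cluster1</name><value>n1,n2</value></property>
  <property><name>dfs.namenode.rpc-address.cluster1.n1</name><value>moses.namenode:9090</value></property>
  <property><name>dfs.namenode.rpc-address.cluster1.n2</name><value>moses.datanode3:9090</value></property>
  <property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://moses.namenode:8485;moses.datanode1:8485;moses.datanode2:8485;moses.datanode3:8485;moses.datanode4:8485/cluster1</value></property>
</configuration>"""


def load_properties(xml_text: str) -> Dict[str, str]:
    """Parse a Hadoop-style <configuration> document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props: Dict[str, str] = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_ha(props: Dict[str, str]) -> List[str]:
    """Return a list of problems found in the HA-related properties."""
    problems: List[str] = []
    services = [s.strip() for s in props.get("dfs.nameservices", "").split(",") if s.strip()]
    if not services:
        problems.append("dfs.nameservices is not set")
    for ns in services:
        ids_key = f"dfs.ha.namenodes.{ns}"
        nn_ids = [i.strip() for i in props.get(ids_key, "").split(",") if i.strip()]
        if len(nn_ids) < 2:
            problems.append(f"{ids_key} should list two NameNode IDs")
        for nn in nn_ids:
            rpc_key = f"dfs.namenode.rpc-address.{ns}.{nn}"
            if rpc_key not in props:
                problems.append(f"missing {rpc_key}")
    return problems


def journal_nodes(props: Dict[str, str]) -> List[str]:
    """Extract the host:port entries from a qjournal:// shared edits URI."""
    uri = props.get("dfs.namenode.shared.edits.dir", "")
    prefix = "qjournal://"
    if not uri.startswith(prefix):
        return []
    hosts, _, _journal_id = uri[len(prefix):].partition("/")
    return [h for h in hosts.split(";") if h]


if __name__ == "__main__":
    props = load_properties(SAMPLE)
    print("problems:", check_ha(props))
    print("JournalNodes:", journal_nodes(props))
```

To check a real file instead of the embedded excerpt, the file's contents could be read and passed to `load_properties`. The check deliberately covers only the nameservice, NameNode IDs, and RPC addresses; fencing and ZooKeeper settings would need their own rules.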
Reposted from: https://www.cnblogs.com/wq920/p/5624180.html