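Deploying a Hadoop Cluster on Linux: HA-QJM
Environment
The basic setup needs 4 machines (one namenode and three datanodes);
The HA setup needs 8 machines: two namenodes (one active nn, one standby nn), three datanodes, and three zookeeper machines (these three can be omitted by running the zookeeper daemons on the other machines). Three journalnodes are also required, but since the journalnode is lightweight, they are deployed on the datanodes here.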
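Three journalnodes and three zookeeper nodes are used because both rely on a majority: a group of N nodes keeps working as long as more than half of them are up. A minimal sketch of that arithmetic (the function name tolerated_failures is made up for illustration):

```shell
# Number of node failures an N-node majority quorum can survive:
# (N - 1) / 2 with integer division.
# tolerated_failures is a hypothetical helper name.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

tolerated_failures 3   # prints 1
tolerated_failures 5   # prints 2
```

So three nodes tolerate one failure, a fourth node adds nothing, and five are needed to tolerate two.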
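Configure the following on the three zookeeper machines:
1 Create the hadoop user
2 Set up passwordless ssh login
3 Change the hostname
4 Install the JDK
5 Download the zookeeper package
Download URL: http://mirror.nus.edu.sg/apache/zookeeper
Download zookeeper-3.4.6 into /opt/ and extract it
6 Edit /etc/profile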
export ZOO_HOME=/opt/zookeeper-3.4.6
export ZOO_LOG_DIR=/opt/zookeeper-3.4.6/logs
Apply the changes:
source /etc/profile
7 Create the zookeeper data directory:
mkdir /opt/zookeeper-3.4.6/data
8 Create the configuration file under $ZOO_HOME/conf:
vi zoo.cfg, and add the following:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeper-3.4.6/data
# the port at which the clients will connect
clientPort=2181
server.1=10.9.214.167:31316:31317
server.2=10.9.214.18:31316:31317
server.3=10.9.214.211:31316:31317
9 In /opt/zookeeper-3.4.6/data/, create a file named myid containing the server's ID: 1 on zookeeper1, 2 on zookeeper2, 3 on zookeeper3. For example:
echo 1 >/opt/zookeeper-3.4.6/data/myid
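The myid value has to match the N of the server.N line for that machine in zoo.cfg. Instead of typing it by hand on each host, it can be looked up from the config. A small sketch, assuming the server lines have the server.N=address:port:port form shown above (myid_for_host is a name chosen here, not a zookeeper tool):

```shell
# Print the N of the "server.N=<address>:<port>:<port>" line whose
# address equals the given host.
# Arguments: $1 = path to zoo.cfg, $2 = address exactly as written in zoo.cfg.
# myid_for_host is a hypothetical helper name.
myid_for_host() {
    awk -F'[=:]' -v host="$2" '
        $1 ~ /^server\.[0-9]+$/ && $2 == host {
            sub(/^server\./, "", $1)
            print $1
        }
    ' "$1"
}

# Usage on zookeeper1 (paths as used in this article):
#   myid_for_host "$ZOO_HOME/conf/zoo.cfg" 10.9.214.167 > "$ZOO_HOME/data/myid"
```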
10 Start the zookeeper service:
cd $ZOO_HOME
./bin/zkServer.sh start
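11 Verify
To test whether the zookeeper cluster was set up successfully, run the following command from $ZOO_HOME. If no error is reported, the cluster is working:
./bin/zkCli.sh -server localhost:2181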
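Another check is zkServer.sh status, which reports the role of the local server. In a healthy three-node ensemble one server should report leader and the other two follower. The helper below (zk_mode, a hypothetical name) only extracts the Mode line from that output:

```shell
# Read the output of "zkServer.sh status" on stdin and print only the
# server role (leader, follower, or standalone).
# zk_mode is a hypothetical helper name.
zk_mode() {
    sed -n 's/^Mode: //p'
}

# Usage on each zookeeper machine:
#   "$ZOO_HOME/bin/zkServer.sh" status 2>&1 | zk_mode
```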
In the Hadoop configuration, only core-site.xml and hdfs-site.xml need to be modified.
Configure core-site.xml
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop-2.6.0/tmp</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>10.9.214.151</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://cluster_haohzhang</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>10.9.214.167:2181,10.9.214.18:2181,10.9.214.211:2181</value>
    </property>
Configure hdfs-site.xml
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/hadoop-2.6.0/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/hadoop-2.6.0/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>cluster_haohzhang</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.cluster_haohzhang</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.cluster_haohzhang.nn1</name>
        <value>10.9.214.151:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.cluster_haohzhang.nn2</name>
        <value>10.9.214.15:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.cluster_haohzhang.nn1</name>
        <value>10.9.214.151:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.cluster_haohzhang.nn2</name>
        <value>10.9.214.15:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://10.9.214.158:8485;10.9.214.160:8485;10.9.214.149:8485/cluster_haohzhang</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.cluster_haohzhang</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/hadoop-2.6.0/journalnode</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
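With this many properties it is easy to define the same name twice or leave a stale entry behind. A quick sketch of a check that lists any property name appearing more than once in a config file (duplicate_property_names is a hypothetical helper, and it assumes each <name> element sits on a single line, as in the files above):

```shell
# Print every property name that occurs more than once in a Hadoop
# XML config file. Prints nothing when all names are unique.
# Argument: $1 = path to the XML file.
# duplicate_property_names is a hypothetical helper name.
duplicate_property_names() {
    grep -o '<name>[^<]*</name>' "$1" \
        | sed 's/<[^>]*>//g' \
        | sort \
        | uniq -d
}

# Usage (assuming the default config directory of the Hadoop 2.x tarball):
#   duplicate_property_names "$HADOOP_HOME/etc/hadoop/hdfs-site.xml"
```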
Procedure
1 First delete all metadata on every namenode, datanode, and journalnode
2 Start the three journalnode processes
hadoop-daemon.sh start journalnode
3 Format the namenode
On one of the namenodes, run:
hdfs namenode -format
This step connects to the journalnodes and formats them as well.
4 Start HDFS on the namenode that was just formatted:
cd $HADOOP_HOME/sbin; ./start-dfs.sh
5 On the other namenode, run:
hdfs namenode -bootstrapStandby
6 Verify manual failover
On either namenode, run
hdfs haadmin -help <command>
to see a command's usage. Here we use
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
to get the state of the two namenodes. There are two possible states: standby and active.
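Switch the state manually (for example, failing over from nn1 to nn2):
hdfs haadmin -failover nn1 nn2
If it succeeds, nn2 becomes active.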
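To avoid querying each namenode by hand, the getServiceState calls can be wrapped in a loop. A minimal sketch, assuming the namenode IDs nn1 and nn2 from dfs.ha.namenodes in hdfs-site.xml (active_namenode is a made-up function name):

```shell
# Print the ID of the first namenode that reports "active".
# Returns non-zero if none of the given IDs is active.
# active_namenode is a hypothetical helper name.
active_namenode() {
    for nn in "$@"; do
        state=$(hdfs haadmin -getServiceState "$nn" 2>/dev/null)
        if [ "$state" = "active" ]; then
            echo "$nn"
            return 0
        fi
    done
    return 1
}

# Usage (on a host that has the cluster configuration):
#   active_namenode nn1 nn2
```

Running it before and after the manual switch should show the active ID change.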
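7 Automatic failover with zookeeper
7.1 Initialize zkfc on one of the namenodes
hdfs zkfc -formatZK
This step tries to connect to port 2181 on zookeeper and creates a znode in zookeeper.
7.2 Start HDFS on the namenode
cd $HADOOP_HOME/sbin; ./start-dfs.sh
7.3 Verify that all processes started successfully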
[hadoop@hadoopmaster-standby sbin]$ jps
12277 NameNode
12871 Jps
12391 DFSZKFailoverController
[hadoop@hadoopslave1 hadoop-2.6.0]$ jps
7698 DataNode
7787 JournalNode
7933 Jps
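7.4 Verify automatic failover
Kill all Hadoop processes on the active namenode, then check whether the other namenode has changed from standby to active.
Note: by default the configuration checks the health status every 5 seconds.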
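Because the switch is not instantaneous, a check made right after the kill may still show standby. A rough sketch of polling until the surviving namenode reports active, with a 30-second default limit picked arbitrarily (wait_until_active is a hypothetical helper):

```shell
# Poll a namenode's HA state once per second until it is "active".
# Arguments: $1 = namenode ID, $2 = timeout in seconds (default 30,
# an arbitrary choice for this sketch).
# Returns 0 once active, 1 if the timeout expires first.
wait_until_active() {
    nn="$1"
    remaining="${2:-30}"
    while [ "$remaining" -gt 0 ]; do
        if [ "$(hdfs haadmin -getServiceState "$nn" 2>/dev/null)" = "active" ]; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}

# Usage, assuming nn1 was the active namenode that was killed:
#   wait_until_active nn2 && echo "nn2 took over"
```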
Reposted from: https://blog.51cto.com/haohaozhang/1606714