Deploying Hadoop in Pseudo-Distributed Mode on Linux
Hadoop version: 1.2.1
Distribution: Fedora 19, installed under the hadoop account
Step 1: Configure an SSH key for local login (even in pseudo-distributed mode, Hadoop still uses SSH to communicate)
[hadoop@promote ~]$ which ssh
/usr/bin/ssh
[hadoop@promote ~]$ which ssh-keygen
/usr/bin/ssh-keygen
[hadoop@promote ~]$ which sshd
/usr/sbin/sshd
[hadoop@promote ~]$ ssh-keygen -t rsa

Then press Enter at every prompt:
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Passphrases do not match.  Try again.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
2f:a9:60:c7:dc:38:8f:c7:bb:70:de:d4:39:c3:39:87 hadoop@promote.cache-dns.local
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                 |
|                 |
|                 |
|        S        |
|   o o o o +     |
|  o B.= o E .    |
|   . o Oo+ =     |
|      o.=o.      |
+-----------------+

This generates the private key id_rsa and the public key id_rsa.pub under /home/hadoop/.ssh/.
[hadoop@promote .ssh]$ cd /home/hadoop/.ssh/
[hadoop@promote .ssh]$ ls
id_rsa  id_rsa.pub

Modify the sshd service configuration file:
[hadoop@promote .ssh]$ su root
Password:
[root@promote .ssh]# vi /etc/ssh/sshd_config

Enable RSA public-key authentication:
RSAAuthentication yes
PubkeyAuthentication yes

# The default is to check both .ssh/authorized_keys and .ssh/authorized_keys2
# but this is overridden so installations will only check .ssh/authorized_keys
AuthorizedKeysFile      .ssh/authorized_keys

Save and exit, then restart the sshd service:
[root@promote .ssh]# service sshd restart
Redirecting to /bin/systemctl restart sshd.service

Then switch back to the hadoop user and append the SSH public key to /home/hadoop/.ssh/authorized_keys:
[root@promote .ssh]# su hadoop
[hadoop@promote .ssh]$ cat id_rsa.pub >> authorized_keys

Change the permissions of ~/.ssh/authorized_keys to 644, of ~/.ssh to 700, and of /home/hadoop to 700 (correct permissions are a prerequisite for successful authentication):
[hadoop@promote .ssh]$ chmod 644 authorized_keys
[hadoop@promote .ssh]$ ssh 192.168.211.129
The authenticity of host '192.168.211.129 (192.168.211.129)' can't be established.
RSA key fingerprint is 25:1f:be:72:7b:83:8e:c7:96:b6:71:35:fc:5d:2e:7d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.211.129' (RSA) to the list of known hosts.
Last login: Thu Feb 13 23:42:43 2014

The first login saves the host key in /home/hadoop/.ssh/known_hosts; subsequent logins no longer require a password:
[hadoop@promote .ssh]$ ssh 192.168.211.129
Last login: Thu Feb 13 23:46:04 2014 from 192.168.211.129

The SSH key configuration is now complete.
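If passwordless login still prompts for a password, the permissions listed above are the usual cause. Below is a minimal sketch of the complete permission set, assuming the paths used in this walkthrough:

chmod 700 /home/hadoop                        # home directory must not be group- or world-writable
chmod 700 /home/hadoop/.ssh
chmod 644 /home/hadoop/.ssh/authorized_keys
chmod 600 /home/hadoop/.ssh/id_rsa            # private key should remain owner-only
ssh 192.168.211.129 hostname                  # should print the hostname with no password prompt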
Step 2: Install the JDK
[hadoop@promote ~]$ java -version
java version "1.7.0_25"
OpenJDK Runtime Environment (fedora-2.3.10.3.fc19-i386)
OpenJDK Client VM (build 23.7-b01, mixed mode)

Replace OpenJDK with Oracle's Java SE.
[hadoop@promote .ssh]$ cd ~
[hadoop@promote ~]$ uname -i
i386

Since uname -i reports i386 (a 32-bit machine), download jdk-6u45-linux-i586.bin from the Oracle website and upload it to the server, then make it executable, run the installer, and finally delete the installer:
[hadoop@promote ~]$ chmod u+x jdk-6u45-linux-i586.bin
[hadoop@promote ~]$ ./jdk-6u45-linux-i586.bin
[hadoop@promote ~]$ rm -rf jdk-6u45-linux-i586.bin

The following output shows that the JDK was installed successfully:
[hadoop@promote ~]$ /home/hadoop/jdk1.6.0_45/bin/java -version
java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) Client VM (build 20.45-b01, mixed mode, sharing)
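For the moment the new JDK is only reachable through its full path. As a temporary convenience you can point the current shell at it; the exports below are a session-only sketch (the permanent PATH change follows in step 3 via ~/.bash_profile):

export JAVA_HOME=/home/hadoop/jdk1.6.0_45     # directory created by the installer above
export PATH=$JAVA_HOME/bin:$PATH              # prefer this JDK over the system OpenJDK
java -version                                 # should now report 1.6.0_45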
Step 3: Install Hadoop

Download hadoop-1.2.1.tar.gz from the Hadoop website and upload it to /home/hadoop on the server.
[hadoop@promote ~]$ tar -xzf hadoop-1.2.1.tar.gz
[hadoop@promote ~]$ rm -rf hadoop-1.2.1.tar.gz
[hadoop@promote ~]$ cd hadoop-1.2.1/conf/
[hadoop@promote conf]$ vi hadoop-env.sh

Point JAVA_HOME at the directory of the JDK installed in step 2:
# The java implementation to use.  Required.
export JAVA_HOME=/home/hadoop/jdk1.6.0_45

Save and exit.
[hadoop@promote ~]$ vi ~/.bash_profile

Append the Hadoop and JDK bin directories to the PATH environment variable:
......
PATH=$PATH:$HOME/.local/bin:$HOME/bin:/home/hadoop/hadoop-1.2.1/bin:/home/hadoop/jdk1.6.0_45/bin

export PATH

Save and exit, log out, and log back in as hadoop. If you see the following output, the PATH environment variable was set successfully:
[hadoop@promote ~]$ echo $PATH
/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/hadoop/.local/bin:/home/hadoop/bin:/home/hadoop/hadoop-1.2.1/bin:/home/hadoop/jdk1.6.0_45/bin

Step 4: Modify the Hadoop Configuration Files
Modify core-site.xml (the advantage of using an IP address rather than a hostname or localhost is that /etc/hosts does not need to be modified):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://192.168.211.129:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadooptmp</value>
    </property>
</configuration>

Modify mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>192.168.211.129:9001</value>
    </property>
</configuration>

Modify hdfs-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

(With only a single DataNode, the replication factor must be 1.) Both the SecondaryNameNode specified in masters and the slave node specified in slaves point to the local machine:
[hadoop@promote conf]$ cat masters
192.168.211.129
[hadoop@promote conf]$ cat slaves
192.168.211.129

Step 5: Initialize the HDFS Filesystem
Note in particular that Hadoop does not recognize hostnames containing an underscore ("_"), so if your hostname contains one, be sure to change it; see http://blog.csdn.net/a19881029/article/details/20485079 for how.
[hadoop@fedora ~]$ hadoop namenode -format
14/03/04 22:13:41 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = fedora/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.6.0_45
************************************************************/
14/03/04 22:13:42 INFO util.GSet: Computing capacity for map BlocksMap
14/03/04 22:13:42 INFO util.GSet: VM type       = 32-bit
14/03/04 22:13:42 INFO util.GSet: 2.0% max memory = 1013645312
14/03/04 22:13:42 INFO util.GSet: capacity      = 2^22 = 4194304 entries
14/03/04 22:13:42 INFO util.GSet: recommended=4194304, actual=4194304
14/03/04 22:13:42 INFO namenode.FSNamesystem: fsOwner=hadoop
14/03/04 22:13:42 INFO namenode.FSNamesystem: supergroup=supergroup
14/03/04 22:13:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/03/04 22:13:42 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/03/04 22:13:42 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/03/04 22:13:42 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
14/03/04 22:13:42 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/03/04 22:13:53 INFO common.Storage: Image file /tmp/hadoop-hadoop/dfs/name/current/fsimage of size 112 bytes saved in 0 seconds.
14/03/04 22:13:53 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-hadoop/dfs/name/current/edits
14/03/04 22:13:53 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-hadoop/dfs/name/current/edits
14/03/04 22:13:53 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
14/03/04 22:13:53 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at fedora/127.0.0.1
************************************************************/

Step 6: Start Hadoop
[hadoop@fedora logs]$ start-all.sh
starting namenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-namenode-fedora.out
localhost: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-fedora.out
localhost: starting secondarynamenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-fedora.out
starting jobtracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-fedora.out
localhost: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-fedora.out
[hadoop@fedora logs]$ jps
2099 SecondaryNameNode
2184 JobTracker
1976 DataNode
2365 Jps
1877 NameNode
2289 TaskTracker

As you can see, all of the Hadoop daemons have started.
Next, check the log files for errors; if there are none, Hadoop has started successfully.
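As a quick sketch for scanning them (the .log file names below follow the hadoop-<user>-<daemon>-<host> pattern of the .out paths printed by start-all.sh above):

cd /home/hadoop/hadoop-1.2.1/logs
grep -il "error\|exception" hadoop-hadoop-*-fedora.log    # no output means no errors or exceptions were logged
tail -n 20 hadoop-hadoop-namenode-fedora.log              # spot-check the NameNode log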
[hadoop@fedora hadoop]$ hadoop dfsadmin -report
Configured Capacity: 39474135040 (36.76 GB)
Present Capacity: 33661652992 (31.35 GB)
DFS Remaining: 33661612032 (31.35 GB)
DFS Used: 40960 (40 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 192.168.211.129:50010
Decommission Status : Normal
Configured Capacity: 39474135040 (36.76 GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 5812482048 (5.41 GB)
DFS Remaining: 33661612032 (31.35 GB)
DFS Used%: 0%
DFS Remaining%: 85.28%
Last contact: Thu Mar 06 09:48:17 CST 2014

Running a MapReduce job also works without problems.
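For example, the wordcount job that ships with the release makes a convenient smoke test; the input file and HDFS paths below are illustrative and not part of the original walkthrough:

hadoop fs -mkdir /input
hadoop fs -put /home/hadoop/hadoop-1.2.1/README.txt /input
hadoop jar /home/hadoop/hadoop-1.2.1/hadoop-examples-1.2.1.jar wordcount /input /output
hadoop fs -cat /output/part-r-00000               # word counts for the sample input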
Reposted from https://www.cnblogs.com/sean-zou/p/3709998.html