Setting up a Hadoop cluster platform
Environment:
master:192.168.1.20
slave1:192.168.1.21
slave2:192.168.1.22
Preparation:
Install the required packages with yum, then disable the firewall and SELinux:

```bash
yum -y install wget vim gcc net-tools curl lrzsz rsync
yum update
systemctl status firewalld
systemctl stop firewalld
systemctl disable firewalld
vim /etc/selinux/config         # set SELINUX=disabled
vim /etc/security/limits.conf   # raise the number of files that can be opened; append at the end:
* soft nofile 65536             # open files (-n)
* hard nofile 65536
* soft nproc 65565
* hard nproc 65565              # max user processes (-u)
```

Change the hostname:
```bash
vim /etc/hostname   # on master and the two slaves, delete the existing hostname and set it to master / slave1 / slave2 respectively
```

Change /etc/hosts:
```bash
vim /etc/hosts      # append the same entries on all three hosts:
192.168.1.20 master
192.168.1.21 slave1
192.168.1.22 slave2
```

Install the JDK:
```bash
tar -xzvf /usr/local/src/jdk-16_linux-x64_bin.tar.gz -C /usr/local/
vim /etc/profile    # append:
export JAVA_HOME=/usr/local/jdk-16
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JRE_HOME=$JAVA_HOME/jre
# reload
source /etc/profile
# verify
java -version
```

Add a hadoop user:
```bash
# Create the user on all three machines
useradd hadoop
# Set the password to 123456
passwd hadoop
```

Configure passwordless SSH login [required on every node]
Switch to the hadoop user:
```bash
[root@localhost ~]# su - hadoop
[hadoop@localhost ~]$
```

Generate a key pair on each node:
```bash
ssh-keygen -t rsa -P ''    # or simply run ssh-keygen and accept every prompt
# Check that the key pair was generated under the hadoop home directory
[hadoop@localhost .ssh]$ cd /home/hadoop/.ssh
[hadoop@localhost .ssh]$ ls
id_rsa  id_rsa.pub
# Append id_rsa.pub to the authorized_keys file
[hadoop@localhost .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@localhost .ssh]$ ls -a
.  ..  authorized_keys  id_rsa  id_rsa.pub
# Fix the permissions of authorized_keys
[hadoop@master .ssh]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@master .ssh]$ ll
total 16
-rw------- 1 hadoop hadoop  410 Mar 25 20:43 authorized_keys
-rw------- 1 hadoop hadoop 1679 Mar 25 20:34 id_rsa
-rw-r--r-- 1 hadoop hadoop  410 Mar 25 20:34 id_rsa.pub
-rw-r--r-- 1 hadoop hadoop  171 Mar 25 20:50 known_hosts
```

Configure the SSH service:
```bash
# Log in as root
[root@localhost ~]# vim /etc/ssh/sshd_config
# Find "#PubkeyAuthentication yes" and remove the leading #
PubkeyAuthentication yes
```

Restart the SSH service:
```bash
systemctl restart sshd
```

Verify SSH login to the local machine:
```bash
# Switch to the hadoop user
su - hadoop
[hadoop@master ~]$ ssh localhost
```

On the first login the system cannot confirm the authenticity of the host and only knows its public-key fingerprint, so it asks whether to continue connecting; type yes. If the next login goes straight through without any confirmation or password, passwordless SSH login is working.

Exchange SSH keys
Exchange keys between master, slave1 and slave2 so that master and the slaves can SSH into each other without passwords.
Copy the master node's public key id_rsa.pub to each slave node [run as the hadoop user]:
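This copy can be done with scp; a minimal sketch, assuming the key is dropped into the hadoop user's home directory on each slave (it is removed again in a later step):

```bash
# On master, as the hadoop user
scp ~/.ssh/id_rsa.pub hadoop@slave1:~/
scp ~/.ssh/id_rsa.pub hadoop@slave2:~/
```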
On each slave node, append the public key copied from master to the authorized_keys file:
Log in as the hadoop user on slave1 and slave2 and run the append there:
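A sketch of the append step, assuming master's key was copied into the hadoop user's home directory as above:

```bash
# On slave1 and slave2, as the hadoop user
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
```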
Delete master's public key file id_rsa.pub on each slave node:
```bash
rm -rf ~/id_rsa.pub
```

Copy each slave node's public key file to master [repeat for every slave]:
```bash
# Copy the slave node's public key to master
[hadoop@localhost .ssh]$ scp ~/.ssh/id_rsa.pub hadoop@master:~/
The authenticity of host 'master (192.168.1.20)' can't be established.
ECDSA key fingerprint is SHA256:AlbOTMHeCJIgoXJOW7d9N9pSMRUs11+z++45WorTBKA.
ECDSA key fingerprint is MD5:14:20:a8:b5:b0:b7:54:f7:5e:07:b2:0b:31:ee:6a:fc.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.1.20' (ECDSA) to the list of known hosts.
hadoop@master's password:
id_rsa.pub                                   100%  410    24.9KB/s   00:00
# On master, append the copied slave key to authorized_keys
[hadoop@master ~]$ cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
# Delete the slave node's public key file
[hadoop@master ~]$ rm -rf ~/id_rsa.pub
```

Verify:
Master's authorized_keys file should now contain three public keys (master, slave1 and slave2); slave1 and slave2 should each contain two (their own key plus master's).

Log in to both slaves from master:

```bash
[hadoop@master .ssh]$ ssh hadoop@slave1
Last failed login: Thu Mar 25 22:18:20 CST 2021 from master on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Thu Mar 25 22:10:12 2021 from localhost
[hadoop@localhost ~]$ exit
logout
Connection to slave1 closed.
[hadoop@master .ssh]$ ssh hadoop@slave2
Last login: Thu Mar 25 22:11:52 2021 from localhost
[hadoop@localhost ~]$ exit
logout
```

Log in to master from a slave:

```bash
[hadoop@localhost .ssh]$ ssh hadoop@master
Last failed login: Thu Mar 25 22:42:12 CST 2021 from slave1 on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Thu Mar 25 20:57:34 2021 from localhost
[hadoop@master ~]$ exit
logout
```

Install Hadoop on the master node
Download, extract, and move it to /usr/local/:
```bash
wget https://mirrors.bfsu.edu.cn/apache/hadoop/common/hadoop-3.2.2/hadoop-3.2.2.tar.gz
tar -zxvf /usr/local/src/hadoop-3.2.2.tar.gz -C /usr/local/
mv /usr/local/hadoop-3.2.2 /usr/local/hadoop
```

Configure the Hadoop environment variables:
```bash
vim /etc/profile
# append:
#hadoop
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
# reload
source /etc/profile
# check
[root@master hadoop]# /usr/local/hadoop/bin/hadoop version
Hadoop 3.2.2
Source code repository Unknown -r 7a3bc90b05f257c8ace2f76d74264906f0f7a932
Compiled by hexiaoqiao on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 5a8f564f46624254b27f6a33126ff4
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-3.2.2.jar
```

Edit the hadoop-env.sh configuration file:
```bash
cd /usr/local/hadoop/etc/hadoop/
vim hadoop-env.sh
# append:
export JAVA_HOME=/usr/local/jdk-16
```

Configuration parameters:
Configure hdfs-site.xml:
Add the following between `<configuration>` and `</configuration>`:

```xml
<property>
  <name>dfs.namenode.http-address</name>
  <value>master:50070</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop/dfs/name</value>
  <description>Location of the HDFS NameNode data on the local file system</description>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop/dfs/data</value>
  <description>Location of the HDFS DataNode data on the local file system</description>
</property>
<property>
  <name>dfs.replication</name>
  <value>3</value>
  <description>Number of replicas is 3</description>
</property>
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>192.168.1.20:50090</value>
  <description>Address and port of the SecondaryNameNode HTTP server</description>
</property>
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
  <description>Whether HDFS files can be read over HTTP; enabling it weakens cluster security</description>
</property>
```

Configure core-site.xml:
Add the following between `<configuration>` and `</configuration>`:

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://192.168.1.20:9000</value>
  <description>File system host and port</description>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
  <description>Stream file buffer size of 128 KB</description>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>file:/usr/local/hadoop/tmp</value>
  <description>Temporary directory (if this is not set, the default is /tmp/hadoop-hadoop, which is deleted when Linux reboots; Hadoop would then have to be reformatted or it will fail to run)</description>
</property>
<!-- Set the HTTP static user to root -->
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>root</value>
</property>
<!-- Disable permission checking -->
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
```

Configure mapred-site.xml:
```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>Default is local; other options are classic and yarn. Using yarn means the YARN cluster handles resource allocation</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
    <description>Address and port of the job history server, used to look at completed MapReduce jobs</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
    <description>Address and port of the job history server web application</description>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/usr/local/hadoop/etc/hadoop,/usr/local/hadoop/share/hadoop/common/*,/usr/local/hadoop/share/hadoop/common/lib/*,/usr/local/hadoop/share/hadoop/hdfs/*,/usr/local/hadoop/share/hadoop/hdfs/lib/*,/usr/local/hadoop/share/hadoop/mapreduce/*,/usr/local/hadoop/share/hadoop/mapreduce/lib/*,/usr/local/hadoop/share/hadoop/yarn/*,/usr/local/hadoop/share/hadoop/yarn/lib/*</value>
  </property>
</configuration>
```

Configure yarn-site.xml:
```xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
    <description>Address the ResourceManager exposes to clients; clients use it to submit and kill applications</description>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
    <description>Address the ResourceManager scheduler exposes to ApplicationMasters for requesting and releasing resources</description>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
    <description>Address the ResourceManager exposes to NodeManagers; NodeManagers use it to send heartbeats to the RM and fetch tasks</description>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
    <description>Address the ResourceManager exposes to administrators for sending management commands</description>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
    <description>Web address of the ResourceManager; users can view cluster information here in a browser</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description></description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    <description>Lets users plug in custom auxiliary services on the NodeManager; the MapReduce shuffle is implemented this way</description>
  </property>
</configuration>
```

Other Hadoop-related configuration:
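The remaining cluster wiring is sketched below as an assumption about a typical Hadoop 3.x setup: list the DataNode hosts in the workers file, push the configured installation (plus the JDK and profile) to both slaves, and give the hadoop user ownership everywhere. Master is listed as a worker here because it also runs a DataNode in this cluster.

```bash
# On master: Hadoop 3.x reads the DataNode hosts from etc/hadoop/workers
cat > /usr/local/hadoop/etc/hadoop/workers <<EOF
master
slave1
slave2
EOF

# Push the configured Hadoop, the JDK and the profile to both slaves
scp -r /usr/local/hadoop root@slave1:/usr/local/
scp -r /usr/local/hadoop root@slave2:/usr/local/
scp -r /usr/local/jdk-16 root@slave1:/usr/local/
scp -r /usr/local/jdk-16 root@slave2:/usr/local/
scp /etc/profile root@slave1:/etc/
scp /etc/profile root@slave2:/etc/

# On every node: give the hadoop user ownership of the installation
chown -R hadoop:hadoop /usr/local/hadoop
```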
Switch to the hadoop user on master and the slave nodes:
```bash
su - hadoop
```

Format the NameNode on the master node
Formatting wipes the data on the NameNode. HDFS must be formatted before it is started for the first time; later startups must not format again, otherwise the DataNodes will be lost. Also, once HDFS has run, its working directories contain data; if you need to reformat, delete the data in the working directories first, or problems will follow.
```bash
[hadoop@master ~]$ /usr/local/hadoop/bin/hdfs namenode -format
WARNING: /usr/local/hadoop/logs does not exist. Creating.
2021-03-26 19:22:49,107 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/192.168.1.20
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 3.2.2
...(output omitted)...
2021-03-27 09:53:13,285 INFO common.Storage: Storage directory /usr/local/hadoop/dfs/name has been successfully formatted.
...(output omitted)...
2021-03-26 19:22:50,686 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.1.20
************************************************************/
```

Start the NameNode:
```bash
[hadoop@master hadoop]$ /usr/local/hadoop/sbin/hadoop-daemon.sh start namenode
WARNING: Use of this script to start HDFS daemons is deprecated.
WARNING: Attempting to execute replacement "hdfs --daemon start" instead.
```

Check the Java processes:
```bash
[hadoop@master hadoop]$ jps
1625 NameNode
1691 Jps
```

Start the DataNode:
```bash
[hadoop@master hadoop]$ /usr/local/hadoop/sbin/hadoop-daemon.sh start datanode
WARNING: Use of this script to start HDFS daemons is deprecated.
WARNING: Attempting to execute replacement "hdfs --daemon start" instead.
[hadoop@master hadoop]$ jps
1825 Jps
1762 DataNode
1625 NameNode
```

Start the SecondaryNameNode:
```bash
[hadoop@master hadoop]$ /usr/local/hadoop/sbin/hadoop-daemon.sh start secondarynamenode
WARNING: Use of this script to start HDFS daemons is deprecated.
WARNING: Attempting to execute replacement "hdfs --daemon start" instead.
[hadoop@master hadoop]$ jps
1762 DataNode
1893 SecondaryNameNode
1926 Jps
1625 NameNode
```

Check whether the cluster nodes are connected:
```bash
[hadoop@master sbin]$ hdfs dfsadmin -report
...(output omitted)...
Live datanodes (1):

Name: 192.168.1.20:9866 (master)
Hostname: master
Decommission Status : Normal
Configured Capacity: 30041706496 (27.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 4121952256 (3.84 GB)
DFS Remaining: 25919746048 (24.14 GB)
DFS Used%: 0.00%
DFS Remaining%: 86.28%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:55 CST 2021
Num of Blocks: 0
```

Something is wrong: the two slave nodes are not connected.
Solution:
```bash
# Stop all services with one command
/usr/local/hadoop/sbin/stop-all.sh
# Delete the data produced by the earlier format and startup
# (remove all files under these directories):
#   /usr/local/hadoop/logs/
#   /usr/local/hadoop/dfs/data/
#   /usr/local/hadoop/dfs/name/
#   /usr/local/hadoop/tmp/
# Reformat on master, start the services, then check again
[hadoop@master sbin]$ hdfs dfsadmin -report
Configured Capacity: 92271554560 (85.93 GB)
Present Capacity: 81996009472 (76.36 GB)
DFS Remaining: 81995984896 (76.36 GB)
DFS Used: 24576 (24 KB)
DFS Used%: 0.00%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups:
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (3):

Name: 192.168.1.20:9866 (master)
Hostname: master
Decommission Status : Normal
Configured Capacity: 30041706496 (27.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 4121952256 (3.84 GB)
DFS Remaining: 25919746048 (24.14 GB)
DFS Used%: 0.00%
DFS Remaining%: 86.28%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:55 CST 2021
Num of Blocks: 0

Name: 192.168.1.21:9866 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 31114924032 (28.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 3144146944 (2.93 GB)
DFS Remaining: 27970768896 (26.05 GB)
DFS Used%: 0.00%
DFS Remaining%: 89.90%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:49 CST 2021
Num of Blocks: 0

Name: 192.168.1.22:9866 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 31114924032 (28.98 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 3009445888 (2.80 GB)
DFS Remaining: 28105469952 (26.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.33%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 27 13:36:07 CST 2021
Last Block Report: Sat Mar 27 13:22:49 CST 2021
Num of Blocks: 0
```

Stop the services:
```bash
/usr/local/hadoop/sbin/hadoop-daemon.sh stop secondarynamenode
/usr/local/hadoop/sbin/hadoop-daemon.sh stop datanode
/usr/local/hadoop/sbin/hadoop-daemon.sh stop namenode
```

Start and stop all services with one command:
```bash
/usr/local/hadoop/sbin/start-all.sh   # start the Hadoop services (NameNode, DataNode, SecondaryNameNode)
/usr/local/hadoop/sbin/stop-all.sh    # stop the Hadoop services
```

View the cluster from the web UI:
http://192.168.1.20:50070/
Run the Hadoop wordcount example as a test
First create the /input directory in HDFS:
```bash
[hadoop@master sbin]$ hdfs dfs -mkdir /input
[hadoop@master sbin]$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2021-03-27 16:11 /input
[hadoop@master sbin]$
```

Copy the input data file into the /input directory in HDFS:
```bash
[hadoop@master sbin]$ hdfs dfs -put /chenfeng/pzs.log /input
[hadoop@master sbin]$ hdfs dfs -ls /input
Found 1 items
-rw-r--r--   3 hadoop supergroup  199205376 2021-03-28 22:31 /input/pzs.log
[hadoop@master sbin]$
```

View it in the browser:
The file does not show up, and the page also reports an error:
Solution 1:
Download link for the javax.activation jar
When downloading, pick the version with the most ratings.
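A sketch of what to do with the downloaded jar, assuming it is placed on Hadoop's common classpath (the jar file name depends on the version actually downloaded):

```bash
# Copy the downloaded javax.activation jar into Hadoop's common lib directory,
# then restart HDFS so the web UI can pick it up
cp javax.activation-api-1.2.0.jar /usr/local/hadoop/share/hadoop/common/lib/
/usr/local/hadoop/sbin/stop-all.sh
/usr/local/hadoop/sbin/start-all.sh
```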
Solution 2: replace the Java version with JDK 8 directly:
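A sketch of the switch, assuming a JDK 8 archive is unpacked to /usr/local (the archive name and directory below are placeholders) and JAVA_HOME is updated in /etc/profile and hadoop-env.sh on every node:

```bash
tar -xzvf /usr/local/src/jdk-8u281-linux-x64.tar.gz -C /usr/local/
vim /etc/profile                                # change to: export JAVA_HOME=/usr/local/jdk1.8.0_281
source /etc/profile
vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh  # change to: export JAVA_HOME=/usr/local/jdk1.8.0_281
java -version                                   # should now report 1.8.x
```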
Problem solved:
Run the wordcount example:
If an /output directory already exists in HDFS, delete it first; otherwise the job cannot create a new /output directory and will fail.
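If needed, the leftover directory can be removed like this:

```bash
hdfs dfs -rm -r /output
```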
Start the test:
```bash
[hadoop@master sbin]$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar wordcount /input/pzs.log /output
2021-03-28 22:56:16,003 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
2021-03-28 22:56:17,202 INFO ipc.Client: Retrying connect to server: master/192.168.1.20:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-28 22:56:18,203 INFO ipc.Client: Retrying connect to server: master/192.168.1.20:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-28 22:56:19,203 INFO ipc.Client: Retrying connect to server: master/192.168.1.20:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
```

An error occurs:
From the error [Retrying connect to server: master/192.168.1.20:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)] it turns out that YARN was not started. A running YARN has two kinds of processes: resourcemanager and nodemanagers.

Start YARN:
```bash
[hadoop@master sbin]$ start-yarn.sh
Starting resourcemanager
Starting nodemanagers
```

Check whether it started successfully:
```bash
[hadoop@master sbin]$ jps
4357 Jps
1750 DataNode
1910 SecondaryNameNode
1630 NameNode
```

The startup failed (no ResourceManager or NodeManager process appears).
Check the logs at the same time:
```bash
[root@master logs]# tailf hadoop-hadoop-resourcemanager-master.log
[root@master logs]# tailf hadoop-hadoop-nodemanager-master.log
```

The logs contain output like this:
```
2021-03-28 23:35:04,775 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: registered UNIX signal handlers for [TERM, HUP, INT]
2021-03-28 23:35:05,212 INFO org.apache.hadoop.conf.Configuration: found resource core-site.xml at file:/usr/local/hadoop/etc/hadoop/core-site.xml
2021-03-28 23:35:05,317 INFO org.apache.hadoop.conf.Configuration: resource-types.xml not found
2021-03-28 23:35:05,317 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-03-28 23:35:05,348 INFO org.apache.hadoop.conf.Configuration: found resource yarn-site.xml at file:/usr/local/hadoop/etc/hadoop/yarn-site.xml
2021-03-28 23:35:05,350 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.resourcemanager.RMFatalEventType for class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMFatalEventDispatcher
2021-03-28 23:35:05,390 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: NMTokenKeyRollingInterval: 86400000ms and NMTokenKeyActivationDelay: 900000ms
2021-03-28 23:35:05,392 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager: ContainerTokenKeyRollingInterval: 86400000ms and ContainerTokenKeyActivationDelay: 900000ms
```

It complains that resource-types.xml cannot be found.
Cause: Hadoop 3.x requires various environment variables to be configured:
Solution:
Modify the configuration:
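One common way to do this in Hadoop 3.x, offered here as an assumption about what was changed, is to export the Hadoop home and configuration variables in /etc/profile on every node:

```bash
# Append to /etc/profile, then reload it
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
source /etc/profile
```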
Stop the services:
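Presumably with the one-command script shown earlier:

```bash
/usr/local/hadoop/sbin/stop-all.sh
```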
Start the services:
```bash
[hadoop@master sbin]$ /usr/local/hadoop/sbin/start-all.sh
```

Run the wordcount example again:
```bash
[hadoop@master sbin]$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar wordcount /input/pslstreaming_log1.txt /output
2021-03-30 10:25:36,653 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
2021-03-30 10:25:37,432 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1617071112579_0001
2021-03-30 10:25:37,979 INFO input.FileInputFormat: Total input files to process : 1
2021-03-30 10:25:38,225 INFO mapreduce.JobSubmitter: number of splits:1
2021-03-30 10:25:38,582 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1617071112579_0001
2021-03-30 10:25:38,583 INFO mapreduce.JobSubmitter: Executing with tokens: []
2021-03-30 10:25:38,712 INFO conf.Configuration: resource-types.xml not found
2021-03-30 10:25:38,712 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-03-30 10:25:39,091 INFO impl.YarnClientImpl: Submitted application application_1617071112579_0001
2021-03-30 10:25:39,123 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1617071112579_0001/
2021-03-30 10:25:39,123 INFO mapreduce.Job: Running job: job_1617071112579_0001
2021-03-30 10:25:46,264 INFO mapreduce.Job: Job job_1617071112579_0001 running in uber mode : false
2021-03-30 10:25:46,264 INFO mapreduce.Job:  map 0% reduce 0%
2021-03-30 10:25:53,462 INFO mapreduce.Job:  map 100% reduce 0%
2021-03-30 10:25:58,543 INFO mapreduce.Job:  map 100% reduce 100%
2021-03-30 10:25:59,555 INFO mapreduce.Job: Job job_1617071112579_0001 completed successfully
2021-03-30 10:25:59,617 INFO mapreduce.Job: Counters: 54
...(output omitted)...
```

It is not clear to me why the message [resource.ResourceUtils: Unable to find 'resource-types.xml'.] still appears, although it is only an INFO-level notice and the job completes successfully.
View the output file:
```bash
[hadoop@master sbin]$ hdfs dfs -cat /output/part-r-00000|head
"",     308
""],    2
"9716168072",   601
"9716168072"},  1
"?arrc=2&linkmode=7",   1
"Count=2        299
"a50_inactive_threshold":       300
"a50_refresh_interval": 119
"a50_state_check_interval":     300
"app_private_data":     299
cat: Unable to write to output stream.
# Too much output; only show the first 10 lines
```

View it in the web UI:
When trying to create a new file from the web page an error appears; files and directories cannot be created:
Analysis of the problem:
When I browse or delete directories and files in the browser, why does the user show up as dr.who? dr.who is simply the static user name Hadoop uses for HTTP access; it has no special meaning, and its configuration can be seen in core-default.xml.
We can change it to the current user by modifying core-site.xml:
```xml
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>hadoop</value>
</property>
```

In addition, looking at the HDFS default configuration hdfs-default.xml shows that permission checking is enabled by default:
dfs.permissions.enabled=true   # whether permission checking is enabled in HDFS; the default is true

Solution 1:
Change the permission settings of the /user directory directly, as follows:
Check:
Before changing the permissions:
After changing the permissions:
Create the chenfeng directory:
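A command-line sketch of these steps, assuming the permissions are simply opened up with mode 777 and that the chenfeng directory is created under /user (both are assumptions):

```bash
# Check the current permissions
hdfs dfs -ls /
# Create /user if it does not exist yet, then open up its permissions
hdfs dfs -mkdir -p /user
hdfs dfs -chmod -R 777 /user
hdfs dfs -ls /
# Create the chenfeng directory
hdfs dfs -mkdir /user/chenfeng
```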
Open 192.168.1.20:8088 to view the jobs running on the YARN cluster:
Summary