[Original] Experience Sharing (6): How to View the Yarn Logs of Tasks Submitted by Oozie
You can view the detailed information of a workflow by its Oozie job id with the following command:
oozie job -info 0012077-180830142722522-oozie-hado-W

The workflow details are as follows:
Job ID : 0012077-180830142722522-oozie-hado-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : test_wf
App Path      : hdfs://hdfs_name/oozie/test_wf.xml
Status        : KILLED
Run           : 0
User          : hadoop
Group         : -
Created       : 2018-09-25 02:51 GMT
Started       : 2018-09-25 02:51 GMT
Last Modified : 2018-09-25 02:53 GMT
Ended         : 2018-09-25 02:53 GMT
CoordAction ID: -

Actions
------------------------------------------------------------------------------------------------------------------------------------
ID                                                       Status    Ext ID                           Ext Status      Err Code
------------------------------------------------------------------------------------------------------------------------------------
0012077-180830142722522-oozie-hado-W@:start:             OK        -                                OK              -
------------------------------------------------------------------------------------------------------------------------------------
0012077-180830142722522-oozie-hado-W@test_spark_task     ERROR     application_1537326594090_5663   FAILED/KILLED   JA018
------------------------------------------------------------------------------------------------------------------------------------
0012077-180830142722522-oozie-hado-W@Kill                OK        -                                OK              E0729
------------------------------------------------------------------------------------------------------------------------------------
The failed action is defined as follows:
<action name="test_spark_task">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${job_tracker}</job-tracker>
        <name-node>${name_node}</name-node>
        <master>${jobmaster}</master>
        <mode>${jobmode}</mode>
        <name>${jobname}</name>
        <class>${jarclass}</class>
        <jar>${jarpath}</jar>
        <spark-opts>--executor-memory 4g --executor-cores 2 --num-executors 4 --driver-memory 4g</spark-opts>
    </spark>
</action>
On Yarn, the application corresponding to application_1537326594090_5663 is the following:
application_1537326594090_5663    hadoop    oozie:launcher:T=spark:W=test_wf:A=test_spark_task:ID=0012077-180830142722522-oozie-hado-W    Oozie Launcher
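If you prefer the command line to the ResourceManager UI, a minimal way to locate this launcher application is to grep the application list for the workflow id, since Oozie encodes the workflow and action names in the launcher's application name (a sketch using the standard yarn CLI; adjust -appStates to your needs):
# list applications in all states and filter by the Oozie workflow id
yarn application -list -appStates ALL | grep "0012077-180830142722522-oozie-hado-W"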
Looking at the logs of application_1537326594090_5663, we find:
2018-09-25 10:52:05,237 [main] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl  - Submitted application application_1537326594090_5664
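With Yarn log aggregation enabled, this "Submitted application" line can be pulled straight from the launcher's aggregated logs instead of browsing the UI (a minimal sketch using the standard yarn CLI and the launcher id from this example):
# fetch the launcher's logs and extract the id of the application it submitted
yarn logs -applicationId application_1537326594090_5663 | grep "Submitted application"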
On Yarn, application_1537326594090_5664 corresponds to the following application:
application_1537326594090_5664    hadoop    TestSparkTask    SPARK
In other words, application_1537326594090_5664 is the actual Spark job of the action. Why is there an extra application in between? The class structure and core code are covered in detail at https://www.cnblogs.com/barneywill/p/9895225.html
In brief: when Oozie executes an action, it goes through an ActionExecutor (the most important subclass is JavaActionExecutor; the executors for hive, spark and similar actions are all subclasses of it). JavaActionExecutor first submits a LauncherMapper (a map-only job) to Yarn, which runs a LauncherMain (each concrete action type has its own subclass, such as JavaMain or SparkMain). A Spark action runs SparkMain, and SparkMain calls org.apache.spark.deploy.SparkSubmit to submit the real job.
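Checking both application reports side by side makes this two-level structure easy to confirm: one report is the Oozie launcher, the other is the Spark job it submitted (a minimal check with the standard yarn CLI, using the ids from this example):
# compare the name and type of the launcher and of the submitted job
yarn application -status application_1537326594090_5663 | grep -E "Application-Name|Application-Type"
yarn application -status application_1537326594090_5664 | grep -E "Application-Name|Application-Type"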
If what you submit is a Spark action, the method above is enough to trace the applicationId of the real job.
If you submit a hive2 action, it is actually launched with beeline. Starting with hive2, the beeline output has been simplified: unlike the hive CLI, it no longer prints the detailed applicationId and progress. In that case there are two options:
1) Modify the Hive code so that beeline produces the same detailed output as the hive CLI.
For details see: https://www.cnblogs.com/barneywill/p/10185949.html
2) Find the job manually via its application tag (see the sketch after this list).
When Oozie submits a job through beeline, it adds a mapreduce.job.tags parameter, for example:
--hiveconf
mapreduce.job.tags=oozie-9f896ad3d40c261235dc6858cadb885c
However, this tag cannot be looked up through the yarn application command, so you have to check applications one by one (the actually launched job's applicationId increments from the current LauncherMapper's applicationId), and that is how you find the applicationId of the real job.
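A minimal sketch of that manual search, assuming the yarn CLI is available and using the launcher id from this example as the starting point (substitute your own cluster timestamp and launcher sequence number):
# walk forward from the launcher's applicationId and print enough of each
# application report to spot the job that beeline actually launched
CLUSTER_TS=1537326594090
LAUNCHER_SEQ=5663
for i in $(seq 1 10); do
  APP_ID="application_${CLUSTER_TS}_$((LAUNCHER_SEQ + i))"
  yarn application -status "$APP_ID" 2>/dev/null \
    | grep -E "Application-Id|Application-Name|Application-Type|User"
done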
In addition, the job history server shows detailed information about an application, such as its configuration and tasks.
To see the complete SQL executed by a Hive task, see: https://www.cnblogs.com/barneywill/p/10083731.html
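The same information is also exposed over the MapReduce history server's REST API. A hedged sketch, assuming the history server is reachable at a host called jhs-host on its default web port 19888 (for a MapReduce job, such as the launcher, the job id mirrors its application id, e.g. application_1537326594090_5663 -> job_1537326594090_5663):
curl "http://jhs-host:19888/ws/v1/history/mapreduce/jobs/job_1537326594090_5663"        # job overview
curl "http://jhs-host:19888/ws/v1/history/mapreduce/jobs/job_1537326594090_5663/conf"   # full job configuration
curl "http://jhs-host:19888/ws/v1/history/mapreduce/jobs/job_1537326594090_5663/tasks"  # task list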
Reposted from: https://www.cnblogs.com/barneywill/p/10109487.html
總結
以上是生活随笔為你收集整理的【原创】大叔经验分享(6)Oozie如何查看提交到Yarn上的任务日志的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: .NET 面向对象基础
- 下一篇: Python魔术世界 1 如何使用Vis