Spark task JVM out-of-memory (OOM)
The job was submitted with the following command:

spark-submit \
  --master yarn \
  --deploy-mode client \
  --driver-memory 2g \
  --executor-memory 4g \
  --executor-cores 4 \
  --num-executors 3

After it started running, the task failed with the following error:
2020-09-09 10:53:43 INFO TaskSetManager:54 - Starting task 12.0 in stage 6.0 (TID 420, data-03.com, executor 2, partition 12, NODE_LOCAL, 8637 bytes)
2020-09-09 10:53:43 WARN TaskSetManager:66 - Lost task 2.0 in stage 6.0 (TID 410, data-03.com, executor 2): java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.StringCoding.encode(StringCoding.java:350)
    at java.lang.String.getBytes(String.java:941)
    at org.apache.spark.unsafe.types.UTF8String.fromString(UTF8String.java:110)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.serializefromobject_doConsume$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
    at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:216)
    at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$2.apply(ShuffleExchangeExec.scala:295)
    at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$2.apply(ShuffleExchangeExec.scala:266)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2020-09-09 10:53:44 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 2.
2020-09-09 10:53:44 INFO DAGScheduler:54 - Executor lost: 2 (epoch 0)
2020-09-09 10:53:44 INFO BlockManagerMasterEndpoint:54 - Trying to remove executor 2 from BlockManagerMaster.
2020-09-09 10:53:44 INFO BlockManagerMasterEndpoint:54 - Removing block manager BlockManagerId(2, data-03.com, 21483, None)
2020-09-09 10:53:44 INFO BlockManagerMaster:54 - Removed 2 successfully in removeExecutor
2020-09-09 10:53:44 INFO DAGScheduler:54 - Shuffle files lost for executor: 2 (epoch 0)
2020-09-09 10:53:45 WARN YarnSchedulerBackend$YarnSchedulerEndpoint:66 - Requesting driver to remove executor 2 for reason Container marked as failed: container_1597910745756_0040_01_000003 on host: data-03.com. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Killed by external signal

As the log shows, the failure is a JVM out-of-memory error (java.lang.OutOfMemoryError: GC overhead limit exceeded) on an executor. This usually means the heap available to each task is too small. Two straightforward fixes are to reduce --executor-cores, so fewer concurrent tasks share each executor's heap, or to increase --executor-memory.
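For example, a resubmission that applies both adjustments might look like the sketch below. The 8g / 2-core values are illustrative only (not from the original run) and should be tuned to the actual workload and the YARN container limits on the cluster:

spark-submit \
  --master yarn \
  --deploy-mode client \
  --driver-memory 2g \
  --executor-memory 8g \
  --executor-cores 2 \
  --num-executors 3 \
  # application jar and arguments follow, unchanged from the original submission

With 2 cores instead of 4 and 8g instead of 4g, the heap available per concurrently running task roughly quadruples, which is usually enough to get past a "GC overhead limit exceeded" error caused by undersized executors.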
Summary
The task failed with java.lang.OutOfMemoryError: GC overhead limit exceeded on the executors. Give each task more heap: either lower --executor-cores so fewer tasks share each executor's memory, or raise --executor-memory.