Converting an RDD to a DataFrame in PySpark
I want to convert my RDD to a DataFrame in PySpark.
My RDD:
[(['abc', '1,2'], 0), (['def', '4,6,7'], 1)]
I want the RDD as a DataFrame of this form:
Index Name Number
0 abc [1,2]
1 def [4,6,7]
I tried:
rd2=rd.map(lambda x,y: (y, x[0] , x[1]) ).toDF(["Index", "Name" , "Number"])
But I got this error:
An error occurred while calling
z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 62.0 failed 1 times, most recent failure: Lost task 0.0
in stage 62.0 (TID 88, localhost, executor driver):
org.apache.spark.api.python.PythonException: Traceback (most recent
call last):
Can you tell me where I went wrong?
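The traceback above is cut off before the Python-side exception, so the root cause is not visible. One likely culprit is the two-parameter lambda: `RDD.map` calls the function with a single argument per element, and Python 3 does not unpack a tuple into `x, y` automatically. The sketch below reproduces that behaviour with the built-in `map` instead of Spark (an assumption made so it runs without a cluster); the names `two_params`, `one_param`, and `records` are illustrative.

```python
# Records shaped like the RDD in the question: ([name, numbers], index).
records = [(['abc', '1,2'], 0), (['def', '4,6,7'], 1)]

# map passes ONE argument per element, so a two-parameter lambda fails.
# RDD.map calls its function the same way.
two_params = lambda x, y: (y, x[0], x[1])
try:
    list(map(two_params, records))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Taking the whole element as one argument and indexing into it works.
one_param = lambda rec: (rec[1], rec[0][0], rec[0][1])
reshaped = list(map(one_param, records))

print(error_message)
print(reshaped)
```

If this is indeed the cause, the full Spark traceback would end in a similar `TypeError` raised inside the worker.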
Update:
rd2=rd.map(lambda x: (x[1], x[0][0] , x[0][1]))
This gives me an RDD of the following form:
[(0, 'abc', '1,2'), (1, 'def', '4,6,7')]
To convert it to a DataFrame:
rd2.toDF(["Index", "Name" , "Number"])
It still gives me an error:
An error occurred while calling o2271.showString.
: java.lang.IllegalStateException: SparkContext has been shutdown
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2021)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2050)
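`SparkContext has been shutdown` is raised when a job is submitted to a context that has already stopped, so the mapping itself may be fine and the session may simply need to be recreated. The sketch below is one possible approach, not a confirmed fix: it assumes `pyspark` is installed, and `to_row` and `build_dataframe` are hypothetical helper names. It also splits the comma-separated string so that `Number` becomes a list, as in the desired output.

```python
def to_row(record):
    """Turn ([name, "1,2"], index) into (index, name, [1, 2])."""
    (name, numbers), index = record
    return index, name, [int(n) for n in numbers.split(",")]


def build_dataframe(records):
    """Build a DataFrame on a local session (assumes pyspark is installed)."""
    # Imported lazily so that to_row can be used and tested without Spark.
    from pyspark.sql import SparkSession

    # getOrCreate reuses a live session or starts a new one.
    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("rdd-to-df")
        .getOrCreate()
    )
    rdd = spark.sparkContext.parallelize(records).map(to_row)
    return spark.createDataFrame(rdd, ["Index", "Name", "Number"])


records = [(['abc', '1,2'], 0), (['def', '4,6,7'], 1)]
rows = [to_row(r) for r in records]
print(rows)
```

Calling `build_dataframe(records).show()` should then display the three columns. In a notebook, if the context was killed by an earlier failed job, restarting the kernel before rerunning may also be necessary.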