Spark & Scala - NullPointerException in RDD traversal
I have a number of CSV files and need to combine them into RDDs by part of their filenames.

For example, for the files below:
    $ ls
    20140101_1.csv  20140101_2.csv  20140101_3.csv
    20140201_1.csv  20140201_2.csv  20140201_3.csv
    20140301_1.csv  20140301_3.csv
I need to combine the files whose names match 20140101*.csv into one RDD to work on, and so on.
I am using sc.wholeTextFiles to read the entire directory, grouping the filenames by their patterns to form a string of filenames, and then passing the string to sc.textFile to open the files as a single RDD.
This is the code I have:
    val files = sc.wholeTextFiles("*.csv")
    val indexed_files = files.map(a => (a._1.split("_")(0), a._1))
    val data = indexed_files.groupByKey

    val names = data.map { a =>
      var name = a._2.mkString(",")
      (a._1, name)
    }

    names.foreach { a =>
      var file = sc.textFile(a._2)
      println(file.count)
    }
And I get a SparkException - NullPointerException when I try to call textFile. The error stack refers to an Iterator inside the RDD. I am not able to understand the error:
    15/07/21 15:37:37 INFO TaskSchedulerImpl: Removed TaskSet 65.0, whose tasks have all completed, from pool
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 65.0 failed 4 times, most recent failure: Lost task 1.3 in stage 65.0 (TID 115, 10.132.8.10): java.lang.NullPointerException
            at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:33)
            at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:32)
            at scala.collection.Iterator$class.foreach(Iterator.scala:727)
            at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
            at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:870)
            at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:870)
            at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
            at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
However, when I run sc.textFile(data.first._2).count in the Spark shell, I am able to form the RDD and retrieve the count.

Any help is appreciated.
Converting my comment to an answer:

    var file = sc.textFile(a._2)

inside the foreach of an RDD isn't going to work. You can't nest RDDs like that: the SparkContext only exists on the driver, so it is null inside closures that run on the executors, which is what produces the NullPointerException.
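
One way around this, assuming the number of distinct date prefixes is small, is to collect the grouped filename strings to the driver and create each RDD there. A minimal sketch of that restructuring (the names grouped and pathList are illustrative):

    // Sketch: build the comma-separated path lists as before, but collect
    // them to the driver so sc.textFile is only ever called driver-side.
    val grouped = sc.wholeTextFiles("*.csv")
      .map(a => (a._1.split("_")(0), a._1))   // (date prefix, full path)
      .groupByKey
      .map { case (prefix, paths) => (prefix, paths.mkString(",")) }
      .collect()                              // small: one entry per prefix

    grouped.foreach { case (prefix, pathList) =>
      // sc.textFile accepts a comma-separated list of paths
      val file = sc.textFile(pathList)
      println(s"$prefix: ${file.count}")
    }

The collect() is what keeps the sc.textFile calls on the driver; it is cheap here because there is only one entry per prefix, not per file.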
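
Alternatively, since sc.wholeTextFiles has already loaded the file contents, a per-prefix computation such as a line count can be done without re-reading anything. A sketch under that assumption:

    // Sketch: wholeTextFiles yields (path, content) pairs, so line counts
    // per date prefix can be aggregated entirely within one RDD pipeline.
    val countsByPrefix = sc.wholeTextFiles("*.csv")
      .map { case (path, content) =>
        (path.split("_")(0), content.split("\n").length)
      }
      .reduceByKey(_ + _)

    countsByPrefix.collect().foreach { case (prefix, n) =>
      println(s"$prefix: $n lines")
    }

This avoids reading every file twice, at the cost of holding whole files in memory - which is how wholeTextFiles behaves anyway.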