scala - How can I remove an RDD from a DStream in Spark Streaming?
I would like to drop the first N RDDs from a DStream. I tried to use the following function with transform, but it does not work (ERROR OneForOneStrategy: org.apache.spark.SparkContext java.io.NotSerializableException), and I do not think it would achieve my real goal of removing the RDDs anyway, because it would return empty RDDs instead.
    var num = 0
    def dropNrdds(myRDD: RDD[(String, Int)], dropNum: Int): RDD[(String, Int)] = {
      if (num < dropNum) {
        num = num + 1
        myRDD
      } else {
        return sc.makeRDD(Seq())
      }
    }
The error occurs because your function refers to your var num, and the containing class is not Serializable. Your function will be called by different nodes of the cluster, so everything it depends on must be Serializable, and you cannot share a variable between different invocations of your function (since they may be running on different cluster nodes).
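The capture problem can be reproduced without Spark. The following is a minimal sketch, assuming Scala 2.12 or later (where function literals are serializable lambdas); the names Job, capturesThis, capturesValue, and ClosureCheck are hypothetical and only stand in for the asker's enclosing class.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for the class that holds `var num`; it is NOT Serializable.
class Job {
  var num = 0

  // Reading `num` here really means `this.num`, so the closure captures the whole Job.
  val capturesThis: Int => Boolean = (dropNum: Int) => num < dropNum

  // Copying the value into a local val first means only an Int is captured.
  val capturesValue: Int => Boolean = {
    val snapshot = num
    (dropNum: Int) => snapshot < dropNum
  }
}

object ClosureCheck {
  // Returns true when plain Java serialization of `obj` succeeds.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

Under those assumptions, ClosureCheck.isSerializable(job.capturesThis) should fail for the same reason as the function in the question, while job.capturesValue should serialize. Note that the snapshot is a copy: incrementing it on one node would not be visible anywhere else, which is why a shared counter cannot work this way.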
It seems odd to want to drop a specific number of RDDs from a DStream, since the way a particular DStream is split up into RDDs is pretty much an implementation detail. Perhaps a time-based slice method can be made to do what you want?
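One time-based option, sketched here under assumptions, is to skip early batches by their batch time instead of counting RDDs. This uses the two-argument form of foreachRDD, which passes each batch's Time, rather than slice. The object SkipEarlyBatches, its methods, and the cutoff parameter are hypothetical names introduced for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.Time
import org.apache.spark.streaming.dstream.DStream

object SkipEarlyBatches {
  // Pure decision helper: keep a batch only once its time has reached the cutoff.
  def shouldKeep(batchTime: Time, cutoff: Time): Boolean = batchTime >= cutoff

  // Runs `handle` only on batches at or after `cutoff`; earlier batches are skipped.
  // No mutable counter and no SparkContext are referenced inside the closure.
  def processFrom(stream: DStream[(String, Int)], cutoff: Time)(
      handle: RDD[(String, Int)] => Unit): Unit =
    stream.foreachRDD { (rdd: RDD[(String, Int)], batchTime: Time) =>
      if (shouldKeep(batchTime, cutoff)) handle(rdd)
    }
}
```

Because the decision depends only on the batch time and an immutable cutoff, nothing has to be shared between invocations. How the cutoff is chosen (for example, the start time plus a fixed delay) is left to the application.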