scala - How can I remove an RDD from a DStream in Spark Streaming? -


I would like to leave NDO before a DSTream. I tried to use the following function with the change, but it does not work (Error OneForOneStrategy: org.apache.spark.SparkContext java.io.NotSerializableException), and I do not think it will fulfill my real goal of removing RDDs because it will return empty spaces.

  var num = 0 def dropNrdds (myRDD: RDD [(string, ante]], drop-nim: ITD: RDD [(string, int)] = {if (num & lt ; DropNum) {num = num + 1 myRDD} Other {return sc.makeRDD (Seq ())}  

error is because your function refers to your var num and containing the class serialjob your function should be called by different nodes of the cluster So, depending on who it is, the Serializable , and you can not share a variable between different invocations of your function (since it may be running on different cluster nodes).

It seems that you want to drop a specific number of RDD s from Dstream , just like a special dev / s Is split up, it's an implementation extension. Maybe the time-based piece can be done to do what you want?


Comments

Popular posts from this blog

mysql - How to enter php data into a html multiple select box -

java - Can't add JTree to JPanel of a JInternalFrame -

c++ - Cassandra datastax cpp driver - avoiding unnecessary copies -