I'm trying to execute a Spark Streaming example with Twitter as the source as follows:

public static void main(String... args) {

    SparkConf conf = new SparkConf().setAppName("Spark_Streaming_Twitter").setMaster("local");
    JavaSparkContext sc = new JavaSparkContext(conf);
    JavaStreamingContext jssc = new JavaStreamingContext(sc, new Duration(2));
    JavaSQLContext sqlCtx = new JavaSQLContext(sc);

    String[] filters = new String[] {"soccer"};

    JavaReceiverInputDStream<Status> receiverStream = TwitterUtils.createStream(jssc, filters);

    jssc.start();
    jssc.awaitTermination();
}


But I'm getting the following exception:

Exception in thread "main" java.lang.AssertionError: assertion failed: No output streams registered, so nothing to execute
    at scala.Predef$.assert(Predef.scala:179)
    at org.apache.spark.streaming.DStreamGraph.validate(DStreamGraph.scala:158)
    at org.apache.spark.streaming.StreamingContext.validate(StreamingContext.scala:416)
    at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:437)
    at org.apache.spark.streaming.api.java.JavaStreamingContext.start(JavaStreamingContext.scala:501)
    at org.learning.spark.TwitterStreamSpark.main(TwitterStreamSpark.java:53)


Any suggestions on how to fix this issue?

1 Answer

According to the "Output Operations on DStreams" section of the official Spark Streaming documentation:

"Output operations allow DStream's data to be pushed out to external systems like a database or a file system. Since the output operations actually allow the transformed data to be consumed by external systems, they trigger the actual execution of all the DStream transformations (similar to actions for RDDs)."

In other words, without an output operation you have "no output streams registered", so there is nothing to execute. Transformations on a DStream are lazy: no computation is invoked until an output operation has been registered on the stream.

So you have to call at least one of the following methods on the stream before calling jssc.start():

print()

foreachRDD(func)

saveAsObjectFiles(prefix, [suffix])

saveAsTextFiles(prefix, [suffix])

saveAsHadoopFiles(prefix, [suffix])

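You can also apply any transformations first and then call an output operation on the result. For the code in the question, adding receiverStream.print(); before jssc.start() is enough to register an output stream.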
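To make the mechanism concrete, here is a minimal plain-Java sketch of why start() fails when only transformations are present. It is a toy model, not Spark code: ToyContext, ToyStream, and foreachBatch are hypothetical names invented for this illustration, and the toy throws IllegalStateException where Spark raises an AssertionError.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Toy model only: none of these classes are part of Spark.
class OutputOperationDemo {

    /** Hypothetical stand-in for a streaming context. */
    static final class ToyContext {
        // Only output operations end up here; transformations never do.
        private final List<Runnable> outputOps = new ArrayList<>();

        <T> ToyStream<T> stream(List<T> batch) {
            return new ToyStream<T>(this, () -> batch);
        }

        void start() {
            // Same idea as the validation in the stack trace above:
            // refuse to start when nothing has been registered.
            if (outputOps.isEmpty()) {
                throw new IllegalStateException(
                        "No output streams registered, so nothing to execute");
            }
            outputOps.forEach(Runnable::run);
        }
    }

    /** Hypothetical stand-in for a DStream that holds a single batch. */
    static final class ToyStream<T> {
        private final ToyContext ctx;
        private final Supplier<List<T>> source;

        private ToyStream(ToyContext ctx, Supplier<List<T>> source) {
            this.ctx = ctx;
            this.source = source;
        }

        // Transformation: returns a new lazy stream and registers nothing.
        <R> ToyStream<R> map(Function<? super T, ? extends R> f) {
            return new ToyStream<R>(ctx, () -> {
                List<R> out = new ArrayList<>();
                for (T item : source.get()) {
                    out.add(f.apply(item));
                }
                return out;
            });
        }

        // Output operation: registers work that start() will later run.
        void foreachBatch(Consumer<? super List<T>> action) {
            ctx.outputOps.add(() -> action.accept(source.get()));
        }
    }

    public static void main(String[] args) {
        // Transformations only, as in the question: start() is rejected.
        ToyContext noOutput = new ToyContext();
        noOutput.stream(Arrays.asList("a", "b")).map(String::toUpperCase);
        try {
            noOutput.start();
        } catch (IllegalStateException e) {
            System.out.println("start() failed: " + e.getMessage());
        }

        // With an output operation registered, start() runs the whole chain.
        ToyContext withOutput = new ToyContext();
        withOutput.stream(Arrays.asList("a", "b"))
                .map(String::toUpperCase)
                .foreachBatch(batch -> System.out.println(batch));
        withOutput.start();
    }
}
```

The first context fails because map only describes work and never registers it. The second succeeds because foreachBatch plays the role of an output operation such as print() or foreachRDD(func).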
