
in Big Data Hadoop & Spark by (11.4k points)

I am trying to execute a simple project with Apache Spark. This is my code, SimpleApp.scala:

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/home/hduser/spark-1.2.0-bin-hadoop2.4/README.md" // Should be some file on your system
    // val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext("local", "Simple Job", "/home/hduser/spark-1.2.0-bin-hadoop2.4/")
    val logData = sc.textFile(logFile, 2).cache()
    val numHadoop = logData.filter(line => line.contains("hadoop")).count()
    val numSee = logData.filter(line => line.contains("see")).count()
    println("Lines with hadoop: %s, Lines with see: %s".format(numHadoop, numSee))
  }
}
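
(Aside: the commented-out SparkConf line corresponds to the construction style the Spark docs usually show. Below is a minimal sketch of that variant, assuming the same local setup; the object name SimpleConfApp is hypothetical and not part of the original program.)

/* SimpleConfApp.scala — hypothetical SparkConf-based variant */
import org.apache.spark.{SparkConf, SparkContext}

object SimpleConfApp {
  def main(args: Array[String]): Unit = {
    // The master URL and app name move into a SparkConf instead of
    // the three-argument SparkContext constructor used above.
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local[4]")
    val sc = new SparkContext(conf)
    println(sc.textFile("/home/hduser/spark-1.2.0-bin-hadoop2.4/README.md").count())
    sc.stop()
  }
}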


When I manually submit this job to Spark from the command line:

 /home/hduser/spark-1.2.0-hadoop-2.4.0/bin/spark-submit --class "SimpleApp" --master local[4] target/scala-2.10/simple-project_2.10-1.0.jar 

it runs successfully.
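
For reference, a build.sbt along the following lines would produce that jar name. The actual build file is not shown in the question, so this is a sketch assuming Scala 2.10.4 and Spark 1.2.0, to match the jar path and the installed distribution:

name := "Simple Project"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0"

Running sbt package from the project root then produces target/scala-2.10/simple-project_2.10-1.0.jar.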

If I run it with sbt run while the Apache Spark service is running, it also succeeds, but at the end of the log it prints an error like this:

15/02/06 15:56:49 ERROR Utils: Uncaught exception in thread SparkListenerBus
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:996)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:317)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:48)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1460)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:46)
15/02/06 15:56:49 ERROR ContextCleaner: Error in cleaning thread
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:136)
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:134)
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:134)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1460)
    at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:133)
    at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)

1 Answer

by (32.3k points)

I guess you are using Spark < 2.x. You can safely disregard this error: it is printed at the end of execution, when Spark cleans up and kills the daemon context-cleaning thread.

In newer versions the problem is resolved; this particular message is silenced because it was creating confusion among many users.
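
If you would rather not see the message at all on the older version, stopping the context explicitly before main returns normally avoids the interrupted cleanup threads. A minimal sketch of your program with that one change (everything else as in your original):

/* SimpleApp.scala — with an explicit sc.stop() added at the end */
import org.apache.spark.SparkContext

object SimpleApp {
  def main(args: Array[String]): Unit = {
    val logFile = "/home/hduser/spark-1.2.0-bin-hadoop2.4/README.md"
    val sc = new SparkContext("local", "Simple Job", "/home/hduser/spark-1.2.0-bin-hadoop2.4/")
    try {
      val logData = sc.textFile(logFile, 2).cache()
      val numHadoop = logData.filter(line => line.contains("hadoop")).count()
      val numSee = logData.filter(line => line.contains("see")).count()
      println("Lines with hadoop: %s, Lines with see: %s".format(numHadoop, numSee))
    } finally {
      // Stops the listener bus and ContextCleaner threads gracefully,
      // so sbt does not have to interrupt them when run() returns.
      sc.stop()
    }
  }
}

Under spark-submit the JVM simply exits when the job finishes, so nothing interrupts those daemon threads, which is likely why that path showed no error.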
