0 votes
2 views
in Big Data Hadoop & Spark by (11.4k points)

What causes this serialization error in Apache Spark 1.4.0 when calling:

sc.parallelize(strList, 4)


This exception is thrown:

com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)

It is thrown from addBeanProps in Jackson: com.fasterxml.jackson.databind.deser.BeanDeserializerFactory#addBeanProps

The RDD is a Seq[String], and the number of partitions doesn't seem to matter (I tried 1, 2, and 4).

There is no serialization stack trace, as there normally is when a worker closure cannot be serialized.

What is another way to track this down?

1 Answer

0 votes
by (32.3k points)

I ran into this issue as well. Here's a snippet from my sbt build file; the 'dependencyOverrides' section is what fixed it:

libraryDependencies ++= Seq(
  "com.amazonaws" % "amazon-kinesis-client" % "1.4.0",
  "org.apache.spark" %% "spark-core" % "1.4.0",
  "org.apache.spark" %% "spark-streaming" % "1.4.0",
  "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.4.0",
  "com.amazonaws" % "aws-java-sdk" % "1.10.2"
)

dependencyOverrides ++= Set(
  "com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4"
)
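Some context on why this helps: Spark 1.4.0 was built against jackson-databind 2.4.4, while aws-java-sdk likely pulls in a newer Jackson 2.x transitively. The newer BeanDeserializerFactory is stricter about matching @JsonCreator constructor properties, so deserializing RDDOperationScope (which Spark uses internally when building operation scopes for calls like sc.parallelize) fails with the "Could not find creator property" error. Pinning jackson-databind back to 2.4.4 removes the conflict. As a sketch to confirm the fix took effect (assuming Jackson is on your driver classpath, which it is in any Spark app), you can print the version Jackson itself reports:

```scala
import com.fasterxml.jackson.databind.ObjectMapper

// Prints the jackson-databind version actually resolved on the classpath;
// with the dependencyOverrides above it should report 2.4.4.
println(new ObjectMapper().version())
```

You can also run `sbt evicted` to see which transitive dependency versions sbt discarded during resolution.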
