in Big Data Hadoop & Spark by (11.4k points)

I am trying to use Spark Cassandra Connector in Spark 1.1.0.

I have successfully built the jar file from the master branch on GitHub and have gotten the included demos to work. However, when I try to load the jar files into the spark-shell I can't import any of the classes from the com.datastax.spark.connector package.

I have tried using the --jars option on spark-shell and adding the directory with the jar file to Java's CLASSPATH. Neither of these options works. In fact, when I use the --jars option, the logging output shows that the Datastax jar is being loaded, but I still cannot import anything from com.datastax.

1 Answer

by (32.3k points)

To access Cassandra from the spark-shell, build an assembly of the spark-cassandra-connector with all of its dependencies (an "uberjar") and provide it to the spark-shell using the --jars option, like this:

spark-shell --jars spark-cassandra-assembly-1.0.0-SNAPSHOT-jar-with-dependencies.jar
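Once the shell starts with the assembly jar on the classpath, the import should resolve at the REPL. A sketch of what a session might look like (the keyspace/table names are hypothetical, and this assumes a Cassandra node reachable from the shell):

```scala
// Inside spark-shell, started with the assembly jar via --jars.
// With the uberjar on the classpath, this import now works:
import com.datastax.spark.connector._

// Hypothetical table "kv" in keyspace "test" -- adjust to your own schema.
val rdd = sc.cassandraTable("test", "kv")
println(rdd.count())
```

If the import succeeds but reads fail, check that spark.cassandra.connection.host points at your Cassandra node.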

I was facing a similar issue and resolved it this way; it is simpler and more convenient than loading the long list of individual dependency jars.

I found a gist with a POM file that builds such an assembly, which you can download from here.

Now, to build the uberjar from that POM, run:

mvn package
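For reference, the core of such a POM is a shading/assembly plugin that bundles the connector and its transitive dependencies into a single jar. A minimal sketch of the relevant fragments (these go inside the `<project>` element; the connector and plugin versions here are assumptions, so adjust them to your build):

```xml
<dependencies>
  <dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_2.10</artifactId>
    <version>1.1.0</version>
  </dependency>
</dependencies>

<build>
  <plugins>
    <!-- maven-shade-plugin merges all dependency classes into the output jar -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.3</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

After `mvn package`, the shaded jar in `target/` is what you pass to `spark-shell --jars`.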

Note: If you're using sbt, look into the sbt-assembly plugin.
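A rough sbt equivalent looks like the following sketch (the plugin and library versions are assumptions; pick ones matching your sbt, Scala, and Spark versions):

```scala
// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")

// build.sbt
libraryDependencies ++= Seq(
  // Spark itself is "provided" so it is not bundled into the assembly jar
  "org.apache.spark" %% "spark-core" % "1.1.0" % "provided",
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0"
)
```

Running `sbt assembly` then produces a single jar under `target/` that you can pass to `spark-shell --jars`.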
