The Spark standalone cluster looks like it's running without a problem. I followed a tutorial to set it up.
I have built a fat jar for running this Java app on the cluster. Before running mvn package, the project layout is:
find .
./pom.xml
./src
./src/main
./src/main/java
./src/main/java/SimpleApp.java
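
For reference, a build section like the following in pom.xml can produce a fat jar named simple-project-1.0-allinone.jar. This is only a minimal sketch using the maven-assembly-plugin (my actual pom.xml is not shown above, and the exact plugin and naming may differ):

<build>
  <plugins>
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <descriptorRefs>
          <!-- bundle all dependencies into a single jar -->
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
        <!-- name the output simple-project-1.0-allinone.jar -->
        <finalName>simple-project-1.0-allinone</finalName>
        <appendAssemblyId>false</appendAssemblyId>
      </configuration>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>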
The content of SimpleApp.java is:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class SimpleApp {
    public static void main(String[] args) {
        // Connect to the standalone master and cap executor memory at 1 GB.
        SparkConf conf = new SparkConf()
                .setMaster("spark://10.35.23.13:7077")
                .setAppName("My app")
                .set("spark.executor.memory", "1g");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Count the lines of the test data that contain the letter "a".
        String logFile = "/home/ubuntu/spark-0.9.1/test_data";
        JavaRDD<String> logData = sc.textFile(logFile).cache();
        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) {
                return s.contains("a");
            }
        }).count();

        System.out.println("Lines with a: " + numAs);
        sc.stop();
    }
}
This program only works when the master is set to setMaster("local"). Otherwise I get an error when I launch it with:
$ java -cp path_to_file/simple-project-1.0-allinone.jar SimpleApp
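
Since the app is launched with plain java -cp rather than through Spark's own run scripts, my understanding is that the fat jar is not automatically shipped to the executors; the documented way to do that is SparkConf.setJars. A minimal sketch of the same conf extended that way (path_to_file is the same placeholder as in the command above):

import org.apache.spark.SparkConf;

// Same conf as in SimpleApp, plus the application jar to distribute to the
// workers, so executors can load classes such as the anonymous Function.
SparkConf conf = new SparkConf()
        .setMaster("spark://10.35.23.13:7077")
        .setAppName("My app")
        .set("spark.executor.memory", "1g")
        .setJars(new String[] { "path_to_file/simple-project-1.0-allinone.jar" });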