
I am trying to deploy a recommendation engine as described in the quick start guide. I completed the steps up to building the engine and now want to train it. I ran pio train as the guide says, but got a lengthy error log. It is too long to paste in full, so here are the first lines of the error:

[INFO] [Console$] Using existing engine manifest JSON at /home/PredictionIO/PredictionIO-0.9.6/bin/MyRecommendation/manifest.json

[INFO] [Runner$] Submission command: /home/PredictionIO/PredictionIO-0.9.6/vendors/spark-1.5.1-bin-hadoop2.6/bin/spark-submit --class io.prediction.workflow.CreateWorkflow --jar/PredictionIO/PredictionIO-0.9.6/bin/MyRecommendation/target/scala-2.10/template-scala-parallel-recommendation_2.10-0.1-SNAPSHOT.jar,file:/home/PredictionIO/PredictionIO-0.9.6/bndation/target/scala-2.10/template-scala-parallel-recommendation-assembly-0.1-SNAPSHOT-deps.jar --files file:/home/PredictionIO/PredictionIO-0.9.6/conf/log4j.properties --driver/home/PredictionIO/PredictionIO-0.9.6/conf:/home/PredictionIO/PredictionIO-0.9.6/lib/postgresql-9.4-1204.jdbc41.jar:/home/PredictionIO/PredictionIO-0.9.6/lib/mysql-connector-jav file:/home/PredictionIO/PredictionIO-0.9.6/lib/pio-assembly-0.9.6.jar --engine-id qokYFr4rwibijNjabXeVSQKKFrACyrYZ --engine-version ed29b3e2074149d483aa85b6b1ea35a52dbbdb9a --et file:/home/PredictionIO/PredictionIO-0.9.6/bin/MyRecommendation/engine.json --verbosity 0 --json-extractor Both --env PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pFS_BASEDIR=/root/.pio_store,PIO_HOME=/home/PredictionIO/PredictionIO-0.9.6,PIO_FS_ENGINESDIR=/root/.pio_store/engines,PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pGE_REPOSITORIES_METADATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio,PIURCES_PGSQL_TYPE=jdbc,PIO_FS_TMPDIR=/root/.pio_store/tmp,PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDGSQL,PIO_CONF_DIR=/home/PredictionIO/PredictionIO-0.9.6/conf

[INFO] [Engine] Extracting datasource params...

[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.

[INFO] [Engine] Datasource params: (,DataSourceParams(MyApp3,None))

[INFO] [Engine] Extracting preparator params...

[INFO] [Engine] Preparator params: (,Empty)

[INFO] [Engine] Extracting serving params...

[INFO] [Engine] Serving params: (,Empty)

[WARN] [Utils] Your hostname, test-digin resolves to a loopback address: 127.0.1.1; using 192.168.2.191 instead (on interface p5p1)

[WARN] [Utils] Set SPARK_LOCAL_IP if you need to bind to another address

[INFO] [Remoting] Starting remoting

[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://[email protected]:56574]

[WARN] [MetricsSystem] Using default name DAGScheduler for source because spark.app.id is not set.

[INFO] [Engine$] EngineWorkflow.train

[INFO] [Engine$] DataSource: [email protected]

[INFO] [Engine$] Preparator: [email protected]

[INFO] [Engine$] AlgorithmList: List([email protected])

[INFO] [Engine$] Data sanity check is on.

[INFO] [Engine$] duo.TrainingData does not support data sanity check. Skipping check.

[INFO] [Engine$] duo.PreparedData does not support data sanity check. Skipping check.

[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS

[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS

[WARN] [LAPACK] Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK

[WARN] [LAPACK] Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.StackOverflowError

java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)

java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)

java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)

java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)

java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)

java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)

java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)

java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)

java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)

java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)

java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)

java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)

java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)

scala.collection.immutable.$colon$colon.writeObject(List.scala:379)

sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)

sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

java.lang.reflect.Method.invoke(Method.java:498)

java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)

java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)

java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)

java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)

java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)

java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)

java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)

java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)

java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)

java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)

java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)

java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)

java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)

java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)

java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)

java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)

java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)

java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)

java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)

java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)

java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)

What can I do to overcome this issue?

1 Answer


Here are two things to try:

1. Reduce the numIterations parameter for the algorithm in your engine's engine.json file.

2. If reducing numIterations doesn't work, add checkpointing, which cuts the long chain of dependencies that the serializer otherwise walks recursively until the stack overflows. First create a new directory to store the checkpoints, then have your SparkContext use that directory for checkpointing. Here is an example in Python:

sc.setCheckpointDir('checkpoint/') 
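Why this helps can be sketched without Spark. Each ALS iteration derives new datasets from the previous ones, so the dependency chain grows with numIterations, and serializing a task follows that chain recursively (the repeating ObjectOutputStream frames in the trace above). The toy model below is an analogy in plain Python, not Spark code: `Node`, `build_chain`, and `serialize_depth` are made-up names, and a checkpoint is modelled simply as dropping the link to the parent.

```python
import sys


class Node:
    """Stand-in for an RDD: it remembers the dataset it was derived from."""

    def __init__(self, parent=None):
        self.parent = parent


def build_chain(iterations, checkpoint_interval=None):
    """Derive one node per iteration; optionally cut the lineage every N steps."""
    node = Node()
    for i in range(1, iterations + 1):
        node = Node(parent=node)
        if checkpoint_interval and i % checkpoint_interval == 0:
            node.parent = None  # "checkpoint": keep the data, forget the ancestors
    return node


def serialize_depth(node):
    """Recursive walk, like a serializer following object references."""
    if node.parent is None:
        return 1
    return 1 + serialize_depth(node.parent)


# Assumption for the demo: a chain twice as long as the interpreter's limit.
iterations = sys.getrecursionlimit() * 2

try:
    serialize_depth(build_chain(iterations))
    overflowed = False
except RecursionError:  # Python's counterpart of java.lang.StackOverflowError
    overflowed = True

# With a cut every 10 steps, the walk never goes deeper than 10 nodes.
bounded_depth = serialize_depth(build_chain(iterations, checkpoint_interval=10))
print(overflowed, bounded_depth)
```

The same picture explains why lowering numIterations can be enough on its own: fewer iterations means a shorter chain to walk.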

You may also need to set a checkpoint interval on ALS itself (probably not necessary). To do that:

ALS.checkpointInterval = 2
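For the first suggestion, numIterations sits under the algorithm's params in engine.json. Here is a minimal sketch of lowering it with a script rather than by hand. The sample values (rank 10, 20 iterations, and so on), the "als" algorithm name, and the helper name `lower_num_iterations` are assumptions for illustration, so check them against your own file; only the app name MyApp3 is taken from the log in the question.

```python
import json

# Illustrative excerpt of an engine.json; a real file has more keys
# (id, engineFactory, ...) and possibly different values.
SAMPLE_ENGINE_JSON = """
{
  "datasource": {"params": {"appName": "MyApp3"}},
  "algorithms": [
    {
      "name": "als",
      "params": {"rank": 10, "numIterations": 20, "lambda": 0.01, "seed": 3}
    }
  ]
}
"""


def lower_num_iterations(config, new_value):
    """Cap numIterations for every algorithm entry that defines it."""
    for algorithm in config.get("algorithms", []):
        params = algorithm.get("params", {})
        if "numIterations" in params:
            params["numIterations"] = min(params["numIterations"], new_value)
    return config


config = lower_num_iterations(json.loads(SAMPLE_ENGINE_JSON), new_value=10)
updated_json = json.dumps(config, indent=2)
print(updated_json)
```

To apply this to a real engine, read and write the engine.json file with `json.load` and `json.dump` instead of the embedded sample, then run pio train again.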
