Exception when training data in PredictionIO

Question

asked Jul 3, 2019 in Data Science by sourav (17.6k points)

I am trying to deploy a Recommendation Engine as described in the quick start guide. I completed the steps up to building the engine. Now I want to train the Recommendation Engine, so I ran pio train as the quick start guide says. I got a lengthy error log, too long to paste in full, so here are the first lines of it.

[INFO] [Console$] Using existing engine manifest JSON at /home/PredictionIO/PredictionIO-0.9.6/bin/MyRecommendation/manifest.json
[INFO] [Runner$] Submission command: /home/PredictionIO/PredictionIO-0.9.6/vendors/spark-1.5.1-bin-hadoop2.6/bin/spark-submit --class io.prediction.workflow.CreateWorkflow --jar/PredictionIO/PredictionIO-0.9.6/bin/MyRecommendation/target/scala-2.10/template-scala-parallel-recommendation_2.10-0.1-SNAPSHOT.jar,file:/home/PredictionIO/PredictionIO-0.9.6/bndation/target/scala-2.10/template-scala-parallel-recommendation-assembly-0.1-SNAPSHOT-deps.jar --files file:/home/PredictionIO/PredictionIO-0.9.6/conf/log4j.properties --driver/home/PredictionIO/PredictionIO-0.9.6/conf:/home/PredictionIO/PredictionIO-0.9.6/lib/postgresql-9.4-1204.jdbc41.jar:/home/PredictionIO/PredictionIO-0.9.6/lib/mysql-connector-jav file:/home/PredictionIO/PredictionIO-0.9.6/lib/pio-assembly-0.9.6.jar --engine-id qokYFr4rwibijNjabXeVSQKKFrACyrYZ --engine-version ed29b3e2074149d483aa85b6b1ea35a52dbbdb9a --et file:/home/PredictionIO/PredictionIO-0.9.6/bin/MyRecommendation/engine.json --verbosity 0 --json-extractor Both --env PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pFS_BASEDIR=/root/.pio_store,PIO_HOME=/home/PredictionIO/PredictionIO-0.9.6,PIO_FS_ENGINESDIR=/root/.pio_store/engines,PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pGE_REPOSITORIES_METADATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio,PIURCES_PGSQL_TYPE=jdbc,PIO_FS_TMPDIR=/root/.pio_store/tmp,PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDGSQL,PIO_CONF_DIR=/home/PredictionIO/PredictionIO-0.9.6/conf
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(MyApp3,None))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[WARN] [Utils] Your hostname, test-digin resolves to a loopback address: 127.0.1.1; using 192.168.2.191 instead (on interface p5p1)
[WARN] [Utils] Set SPARK_LOCAL_IP if you need to bind to another address
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://[email protected]:56574]
[WARN] [MetricsSystem] Using default name DAGScheduler for source because spark.app.id is not set.
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: duo.DataSource@6088451e
[INFO] [Engine$] Preparator: duo.Preparator@1642eeae
[INFO] [Engine$] AlgorithmList: List(duo.ALSAlgorithm@a09303)
[INFO] [Engine$] Data sanity check is on.
[INFO] [Engine$] duo.TrainingData does not support data sanity check. Skipping check.
[INFO] [Engine$] duo.PreparedData does not support data sanity check. Skipping check.
[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
[WARN] [LAPACK] Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
[WARN] [LAPACK] Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.StackOverflowError
java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
scala.collection.immutable.$colon$colon.writeObject(List.scala:379)
sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)

What can I do to overcome this issue?

1 Answer

Shlok Pandey · Answer 1 · 2019-07-15T08:24:21+0000

Here are two things to try:

1. Reduce the numIterations parameter for the algorithm in the engine.json file of your engine.

2. If reducing numIterations doesn't work, add checkpointing. ALS extends the RDD lineage with every iteration, and serializing a long lineage is recursive, which is what overflows the stack in your log; checkpointing cuts the lineage so the recursion stays shallow. First, create a new directory to store the checkpoints. Then have your SparkContext use that directory for checkpointing. Here is the example in Python (the Scala SparkContext used by the PredictionIO template has the same setCheckpointDir method):

sc.setCheckpointDir('checkpoint/')
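The two steps just described (create the directory, then point the SparkContext at it) could be wrapped up roughly as follows. This is only a sketch: the helper name enable_checkpointing is made up for illustration, sc is assumed to be an already created SparkContext, and the directory is assumed to live on the local filesystem (local mode). On a cluster the checkpoint directory generally has to be on a shared filesystem such as HDFS, where os.makedirs would not apply.

```python
import os


def enable_checkpointing(sc, checkpoint_dir="checkpoint/"):
    """Create checkpoint_dir if it is missing and register it with Spark.

    `sc` is assumed to be an existing SparkContext; the only method used
    on it is setCheckpointDir, the same call shown in the answer.
    This helper is hypothetical, not part of Spark or PredictionIO.
    """
    # Step 1: make sure the checkpoint directory exists (local filesystem).
    os.makedirs(checkpoint_dir, exist_ok=True)
    # Step 2: tell the SparkContext to write checkpoints there.
    sc.setCheckpointDir(checkpoint_dir)
    return checkpoint_dir
```

Call enable_checkpointing(sc) once, before training starts, so the directory is already set by the time ALS runs.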

You may also need to set a checkpoint interval on the ALS itself, so that it checkpoints every few iterations. This is probably not necessary, but to do it:

ALS.checkpointInterval = 2
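For the first suggestion (lowering numIterations), the edit to engine.json can be made by hand or with a small script like the one below. This is a sketch under assumptions: it supposes engine.json has a top-level "algorithms" list whose entries carry a "params" object containing "numIterations", which is how the recommendation template is usually laid out but may differ in your engine. The function name reduce_num_iterations is hypothetical.

```python
import json


def reduce_num_iterations(engine_json_path, new_value):
    """Set numIterations to new_value for every algorithm that defines it.

    Assumes the layout {"algorithms": [{"params": {"numIterations": ...}}]};
    entries without a numIterations parameter are left untouched.
    Returns the number of algorithm entries that were changed.
    """
    with open(engine_json_path) as f:
        config = json.load(f)

    changed = 0
    for algorithm in config.get("algorithms", []):
        params = algorithm.get("params", {})
        if "numIterations" in params:
            params["numIterations"] = new_value
            changed += 1

    # Write the file back in place, keeping it human-readable.
    with open(engine_json_path, "w") as f:
        json.dump(config, f, indent=2)
    return changed
```

For example, reduce_num_iterations("engine.json", 10) run from the engine directory, followed by pio train again. The value 10 is only an illustration; pick whatever is lower than your current setting.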
