
In my spark-shell, what do entries like the one below mean when I execute a function?

[Stage7:===========>                              (14174 + 5) / 62500]

1 Answer


In the console progress bar, “Stage 7” is the stage that is currently running, and “(14174 + 5) / 62500” is (numCompletedTasks + numActiveTasks) / totalNumOfTasksInThisStage. The length of the bar itself reflects numCompletedTasks / totalNumOfTasksInThisStage.
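To make the three numbers concrete, here is a minimal Python sketch that pulls them out of a progress line. The function name `parse_progress` and the `pending` and `fraction_done` fields are hypothetical helpers for illustration and are not part of Spark. The pattern assumes the line looks like the one in the question.

```python
import re

# Pattern for a console progress line such as
# "[Stage7:=====>      (14174 + 5) / 62500]".
# Allowing optional whitespace after "Stage" is an assumption, made so that
# both "[Stage7:" and "[Stage 7:" are accepted.
PROGRESS_RE = re.compile(
    r"\[Stage\s*(?P<stage>\d+):[=>\s]*"
    r"\((?P<completed>\d+)\s*\+\s*(?P<active>\d+)\)\s*/\s*(?P<total>\d+)\]"
)


def parse_progress(line):
    """Return the stage id and task counts from one progress line, or None."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    fields = {name: int(value) for name, value in match.groupdict().items()}
    # Tasks that are neither finished nor running are still waiting for a slot.
    fields["pending"] = fields["total"] - fields["completed"] - fields["active"]
    # The bar tracks completed / total only; active tasks do not count yet.
    total = fields["total"]
    fields["fraction_done"] = fields["completed"] / total if total else 0.0
    return fields


info = parse_progress(
    "[Stage7:===========>                              (14174 + 5) / 62500]"
)
print(info)
```

For the line in the question this reports 14174 completed, 5 active, and 62500 total tasks, so roughly 23% of the stage is done.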

Here is a more general example:

Assume that you are seeing the following (X, P, Q, and R are always non-negative integers):

[Stage X:==========>            (P + Q) / R]

(In the question, X=7, P=14174, Q=5, and R=62500.)

Spark breaks each job into stages, and each stage into tasks. This progress indicator means that Stage X consists of R tasks. P and Q both start at zero and change as the stage runs: P is the number of tasks that have already finished, and Q is the number of tasks currently executing. For a stage with many tasks (far more than your workers can run at once), expect Q to grow until it matches the number of tasks your workers can run concurrently, and then P to increase as tasks complete. Towards the end, as the last few tasks execute, Q decreases until it reaches 0. At that point P equals R, the stage is done, and Spark moves on to the next stage. R is the total number of tasks in the stage, so it stays constant the whole time.
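The way P and Q evolve can be illustrated with a toy simulation in Python. This is only a sketch under strong assumptions: `simulate_stage` and its `slots` parameter are hypothetical names, every task is assumed to take exactly one tick, and all running tasks finish together. Real Spark tasks finish at different times, so the real numbers move less regularly.

```python
def simulate_stage(total_tasks, slots):
    """Return (P, Q) snapshots for one stage, assuming one tick per task."""
    completed = 0
    snapshots = []
    while completed < total_tasks:
        # At most `slots` tasks can run at the same time.
        active = min(slots, total_tasks - completed)
        snapshots.append((completed, active))
        # Simplification: every running task finishes at the end of the tick.
        completed += active
    # Final state: nothing is running and every task has completed.
    snapshots.append((completed, 0))
    return snapshots


for p, q in simulate_stage(total_tasks=10, slots=4):
    print(f"({p} + {q}) / 10")
```

With 10 tasks and 4 slots, Q stays at 4 while enough tasks remain, drops to 2 for the last batch, and ends at 0 once P reaches 10.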

The ====> bar shows the fraction of work done, P / R. At the beginning the > is towards the left, and it moves to the right as tasks complete.
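As a rough illustration of how such a bar can be drawn from P and R, here is a small Python sketch. It is not Spark's code: `render_progress` and its `width` parameter are hypothetical, and the layout (header, bar, counts) is only assumed to resemble what the console prints.

```python
def render_progress(stage_id, completed, active, total, width=60):
    """Draw one progress line whose bar length is proportional to completed / total."""
    header = f"[Stage {stage_id}:"
    tailer = f"({completed} + {active}) / {total}]"
    # Whatever space is left between header and counts is used for the bar.
    bar_width = width - len(header) - len(tailer)
    if bar_width <= 0 or total <= 0:
        return header + tailer
    # Number of "=" cells; the ">" marker sits right after them.
    filled = bar_width * completed // total
    bar = "".join(
        "=" if i < filled else ">" if i == filled else " "
        for i in range(bar_width)
    )
    return header + bar + tailer


print(render_progress(7, 5, 2, 10, width=40))
print(render_progress(7, 10, 0, 10, width=40))
```

Only the completed count moves the `>` marker; the active count appears in the text on the right but does not lengthen the bar.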

The bar is shown only when spark.ui.showConsoleProgress is true (the default in spark-shell) and the log level in conf/log4j.properties is WARN or ERROR (that is, when !log.isInfoEnabled is true).
